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ABSTRACT 

Clusters of galaxies are excellent locations to probe the distribution of baryons and dark matter 
over a wide range of scales. We study a sample of 7 massive (A/200 = 0.4 — 2 x 10 15 M Q ), relaxed 
galaxy clusters with centrally-located brightest cluster galaxies (BCGs) at z = 0.2 — 0.3. Using the 
observational tools of strong and weak gravitational lensing, combined with resolved stellar kinematics 
within the BCG, we measure the total radial density profile, comprising both dark and baryonic matter, 
over scales of ~ 3 — 3000 kpc. We present Keck spectroscopy yielding 7 new spectroscopic redshifts 
of multiply-imaged sources and extended stellar velocity dispersion profiles of the BCGs. Lensing- 
derived mass profiles typically agree with independent X-ray estimates within ~ 15%, suggesting that 
departures from hydrostatic equilibrium are small and that the clusters in our sample (except A383) 
are not strongly elongated or compressed along the line of sight. The inner logarithmic slope 7 to t 
of the total density profile measured over r/r 2 oo = 0.003 — 0.03, where p to t oc r~ 7tot , is found to be 
nearly universal, with a mean (7tot) = 1.16±0.05 (random) 1q'q7 (systematic) and an intrinsic scatter 
<r 7 < 0.13 (68% confidence). This is further supported by the very homogeneous shape of the observed 
velocity dispersion profiles, which are mutually consistent after a simple scaling. Remarkably, this 
slope agrees closely with high-resolution numerical simulations that contain only dark matter, despite 
the significant contribution of stellar mass on the scales we probe. The Navarro-Frenk- White profile 
characteristic of collisionless cold dark matter is a better description of the total mass density at 
radii ^ 5 — 10 kpc than that of dark matter alone. Hydrodynamical simulations that include baryons, 
cooling, and feedback currently provide a poorer match. We discuss the significance of our findings for 
understanding the physical processes governing the assembly of BCGs and cluster cores, particularly 
the influence of baryons on the inner dark matter halo. 

Subject headings: dark matter — galaxies: elliptical and lenticular, cD — gravitational lensing: strong 
— gravitational lensing: weak — X-rays: galaxies: clusters 



1. INTRODUCTION 

In a cold dark matter (CDM) universe, dark matter 
(DM) halos are expected to be nearly self-similar, and 
their detailed structure can be followed in l arge numer- 
ical simulation s based only on gravi t y (e.g., Dubinski & 
' Cariberg||1991| |Navarro et al.||l996l |Moore et al.j|l9^r 
Ghigna et al.||2000HSpringel et al.| | 2005| IDiemand et al 



2005 ; INavarro et al 
J L 



2010; IGao et aJ 



. 2012). 
allapse 



A key result 

of cold, collisionless gravitational collapse is the forma- 
tion of a central density cusp with a characteristic profile 
Pdm oc r _1 . At large radii the density falls as pum oc r~ 3 . 
These slopes are characteristic of the Navarro-Frenk- 
White (NFW) profile, which provides a reasonable de- 
scription of results from A-body simulations. With im- 
proved resolution, recent simulations have elucid ated de- 
viations from this simple functional form (e.g., Merritt 



Email: ancwmairaastro.caltcch.edu 



1 CahillCenter for Astronomy fc Astrophysics, California In- 



stitutc of Technology. MS 249-17. Pasadena. CA 91125 

21 



DeparTm^^^^F^^Icsr^^iTOrsIl^^r^alllornia, Santa Bar- 



bara. CA 93106 



^ Las Cumbrcs Observat ory Clobal Telescope Network, Santa 



Barbara. CA 93117 

4 Astronomy Department, Universi ty ot Bologna, via Kanzam 



-10127 Boloena, Italy 



Ch. Andre. 69561 Saint Ccnis Laval Ccdex, 1'rancc 

Laboratoirc d'Astrophysiquc dc Marseille, Universitc d'Aix- 
Marscille & CNRS, UMR7326, rue R Jollot-Curie, 13388 



Marseille Ccdex 13, France 



= in clusters of galaxies ( Peiram et al.|2008 Martizzi et al 



et al.||2006| |Navarro et aL]|2010 |Gao et al.||2012| ), show- 
ing that halo profiles are not strictly self-similar and that 
the density slope likely becomes slightly shallower at very 
small radii. 

Real halos also contain baryons that may significantly 
modify the structure of the DM. Cooling allows baryons 
to condense toward the center, 
more concentrated 



M 



whic h makes the DM 
19861 iGnedinet al. 



2004; |Sellwoodfc 
Pedrosa et al. 12009 



Blumenthal et al 

cGaugh 2005 ; Gustafss on et al.|%0"i 
; Abadi ct al. 2010; Sommer-Larsen 



Limousin 2010 k" Additional baryonic effects have been 
proposed to reduce the central concentration, even pro- 
ducing DM cores. These include heating of the central 
cusp via dynamical f riction with infalling sate l lites (e.g 



El-Z ant et al.||2001[ [20041 |Nipoti et al. 
Dia z et al.)20081|Jardel fc Sellwood|20097 



2004; |Romano- 
Johansson et al. 



2009||Dcl Popolo 2012H , feedback from sup ernovae in low- 



mass galaxies (Govcrn ato et aLj|2010| Pontzen & Gover 
nato||2012[ |Brooks fc Zoloto v | |2L)12lh and AGJN feedback 



2012[ ) . Much effort have been devoted to understanding 
the net result of these competing effects on halos using 
comprehensive hyd rodynamical simulations over a, rang e 
of mass scales (e.g., Duffy et al. 2010 Gnedin et al. 2011 ), 
but due to the difficulty of realistically treating all the 
relevant physics, predictions for halos with baryons re- 
main unclear. 

Understanding the relative distribution of dark and 
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baryonic matter is important for several reasons. If CDM 
is an accurate description, then the structure of real ha- 
los can inform us about the assembly of galaxies, groups, 
and clusters through the imprint of baryons on their ha- 
los. For instance, dark and baryonic density profiles can 
inform us about the relative importance of d issipational 
and dissipatio nless assembly processes (e.g., Lackner & 
Ostriker|2010 ). The observation from that massive ellip- 
ticals have nearly isothermal total mass profiles within 
their effective radii - with very little scatter — is a strong 
constraint o n their formation and evolution ( jKoopmans 
et al. 2009). On the other hand, the structure ot ha- 



inner halo is distinct from CDM (e.g., 


Spergel & Stein- 


hardt 2000| Dave et al. 2001 


Kuzio de JNaray et al.|2010 


Maccio et al. 20121), if the 


baryonic effects can be bet- 



indirect DM searches, since the rate of gamma ray pro- 
duction from annihilation scales as /9q M . 

Determining precise and robust mass profiles is chal- 
lenging, particularly if the goal is to separate the dark 
and baryonic components. Low surface brightness and 
dwarf spheroidal galaxies are often considered ideal tar- 
gets for DM studies, since the mass fraction of baryons 
is minimal, and observations indicate that many of these 
galax ies have a DM core rather than the expected cusp 





Simon et al.|2005 


de Blok et al.|2008 


Wolf & Bui- 


lock 'I 


012|). Due to their shallow potential wells, however, 



pernovae (see references above). 

Galaxy clusters are also promising systems for detailed 
study of of mass distributions. Although they are more 
complex systems than dwarf galaxies, the infl uence of 
baryons is possibly weaker and simpler (e.g., Gnedin 
et al. 2004). Owing to the wide range of observational 
tools that can be brought to bear, the mass in individ- 
ual clusters can be measured in detail over a very wide 
range of scales. Clusters are DM-dominated outside of 
the very central regions and are the only systems that can 
be individually mapped to their virial radius, using weak 
gravitational lensing. In selected clusters, strong lensing 
provides exquisite mass measurements that are indepen- 
dent of the dynamical state. X-ray emission from the 
hot intracluster medium can also be used to derive mass 
profiles under the assumption of hydrostatic equilibrium. 
Each of these tools is valid over a specific radial interval. 
Weak lensing cannot reach within ~ 100 kpc. The strong 
lensing zone is usually confined to roughly 30 — 150 kpc 
(partly due to the difficulty in locating central images 
superposed on cluster galaxies). X-ray emission is dif- 
ficult to interpret within ~ 50 kpc due to gas cooling 
and substructure, while temperature measurements be- 
come prohibitive at > 700 kpc. Therefore, combining 
several mass probes is necessary to derive comprehensive 
constraints. 

X-ray and lensing studies have shown that NFW pro- 
files can generally provide adequate descriptions of clus- 
ter halos at radii r > 50 kpc (e.g., |Kneib et al.) 2003; 
Broadhurst et al.|2005||Mandelbaum et al.|2006[|ScnmIdtl 
fc Allen|2007HOkabe et al.|2010||Umetsu et al.|2011| |Coe| 
et al. 2012[ jMorandi fc Limousin||2012[) . Several studies 
have questioned whether the relationship between halo 
mass and concentration, derived based on NFW models, 
follows that in simulations. Many lensing clusters have 



surprisingly high concentrations (e.g., Kneib et al. | |2003 - 
n ' Zitrin et al.|20Tlb[ ). Interpreting 



Broadhurst et al. 2008 



this requires carefu l study ot possible measurement biases 



or sele ction effects ( Hennawi et al.|2007||Meneghetti et al 
|2010a[ ). Measuring the shape ot the radial density profile 
to test whether the NFW form (or the result of numerical 
simulations generally) is valid over the full range of scales 
- for any mass and concentration - is more challenging, 
but possibly more profound. The tools mentioned so far 
cannot test for deviations from an NFW profile in the in- 
ner halo with much statis tical power, even when multiple 
clusters are stacked (e.g., Schmidt k. Allen|2007| |Umetsu| 



et al. 



2011 1, except possibly in rare lensing configura- 

More constraints on 



Limousin et al. 2007) 



tions (e.g., 

smaller scales are necessary to provide a lever arm long 
enough to measure the inner density slope and probe the 
innermost decade in radius now resolved in the best sim- 
ulations. Here the stellar mass in the brightest cluster 
galaxy (BCG) is significant, and there has often been 
confusion in the literature about whether the total den- 
sity or only that of the dark matter is being reported and 
compared to simulations. 

In relaxed clusters hosting a BCG that is closely 
aligned with the center of the halo, the kinematics of 



the stars trace the total gravitational pot ential ( Miralda- 



Escude 


1995 




using 8 


- id 





Natarajan & Kneib 1996). Spectroscopy 
m telescopes can reach from the stellar- 
dominated regime to the regime where DM is dynam- 
ically significant, even at the cosmologica l distances of 



lensing clusters (z > 0.2). |Sand et al.] ( |2002| |2004 1 
showed that by combining strong lensing with stellar 
kinematics, the contribution of the stellar mass can be 
constrained, which allows the DM halo to be isolated and 
its inner slope measured. Particularly strong results were 
obtained in clusters presenting radial arcs. In 5 of the 6 
clusters they studied, the inner logarithmic density slope 
fi = — dlogpDM/rflogr was foun d to be < 1, shal lower 
than a standard NFW profile. iSand et al.l (|2008|) im- 



proved the analysis in two clusters by constructing a two- 
dimensional lens model and found similar results. INew-l 



man et aL] ( |2009[ N09) additionally incorporated weak 
lensing constraints in A611, providing the first cluster 
mass profile over 3 decades in radius. |Newman et al.| 
(2011 Nil) presented very extended stellar kinematics 
in A383 and additionally used X-ray observations to as- 
sess the non-spherical geometry of the cluster along the 
line of sight. In both A611 and A383 we confirmed a 
shallow inner density slope < 1 for the DM. 

In this paper we present strong lensing, weak lensing, 
and resolved stellar kinematic observations for a sample 
of 7 massive, relaxed galaxy clusters. The clusters span 
the redshift range z — 0.2 — 0.3 and have virial masses 
M 2 oo = 0.4 - 2 x 10 15 M o . Taken together, these data 
span scales of ~ 3 — 3000 kpc, which is well-matched to 
the dynamic range achieved in modern ./V-body simula- 
tions of clusters. We use these data to constrain the total 
density distribution over 3 decades in radius, providing a 
benchmark for high-resolution simulations. We focus on 
the shape of total density profile in this paper and show 
that it is in surprising agreement with numerical simu- 
lations that contain only dark matter. In Paper II of 
the series, we consider the DM and stellar mass profiles 
separately. 

The plan of the paper follows. In Section 2 we in- 
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TABLE 1 

Cluster Sample and Alignment between BCG and Mass Centers 













BCG offset (kpc) from 










BCG peculiar 


Source of 


X-ray Lensing center 


Cool Lx 


Name 


<*> 




velocity (km s -1 ) 


galaxy redshifts 


centroid Ax Ay 


core? (10 37 W) 



MS2137.3-2353 

A963 

A383 

A611 

A2537 

A2667 
A2390 



0.314 
0.206 
0.190 
0.288 
0.294 

0.233 
0.229 



26 
236 
273 

22 
52 



-261 ± 187 

-67 ± 68 
-325 ±311 

438 ± 730 
270 ± 218 



This work 
Sand et al (in prep. 



Braglia et al. 



I 2009 I 



and this work 



C ovone et a l. ( 2006 I 
Vee et al.|P55^ 



1+ 
6 
2 
1 
13t 

3 
2 



1.2 ±0.8 

-2.7 ±0.6 
-1.3 ±0.9 
-0.4 ± 1.2 

-6.5 ± 3.6 
4.9 ±6.7 



0.1 ± 0.6 

2.9 ± 1.1 
4.2 ±0.8 
5.2 ± 1.5 

4.1 ± 2.9 
-0.2 ± 3.5 



Yes a 
No 6 
Yes 6 
No 6 
No c 

Yes 6 
Yes 6 



11.10 
5.03 
4.12 
5.33 
9.37 

11.97 
14.81 



Note. — Redshifts (z) are the biweight mean of N s& \ cluster galaxies identified with an iterative 2.5(7 clip applied. The uncertainty is 
provided on the peculiar velocity of the BCG [«BCG = c ( 2 clus — ZhogVO ± 2 clus)]- F° r A2537 the redshift and «bcg are given relative 
to the main peak (Figure [2j. Where no redshift survey is availab le, the redshift of the B CG is given instead. Offs ets between the BCG 
and the X-ray centroid measured in the central 1 arcmin are from [Sanderson et al. (2009) and Rich ard et al.| J2010;), except those marked 
t which are original to this work. Offsets between the BCG and lensing center are discussed in Section |7.3 | Ax > and Ay > denote 

- 2.4 keV band within -R500 from Piffarctti ct al. (20lT|l. Sources 



offsets west and north of th e BCG. Lx is the X-ray lum in osity in the 0.1 



of cool core classification: 



Donnarumma et al. 



(2009), 



Richard et al. 



( 2010| l, c |Rossetti et al.| ( |2011 1 



troduce the cluster sample and describe its characteris- 
tics. In Section 3, technical aspects of the weak lens- 
ing analysis, based primarily on Subaru imaging, are 
presented along with shear profiles and two-dimensional 
mass maps. Section 4 describes our strong lensing in- 
terpretations, including 7 new spectroscopic redshifts of 
multiply imaged sources. In Section 5 we present Hubble 
Space Telescope (HST) surface photometry of the BCGs 
and stellar population synthesis models. Spectroscopy of 
the BCGs and the derived kinematic measures are dis- 
cussed in Section 6. In Section 7, we present the mathe- 
matical framework used to derive our mass profiles. Sec- 
tion 8 compares our lensing-derived mass profiles to inde- 
pendent X-ray measures, in order to assess the possible 
influence of projection effects on our results. Finally, 
in Section 9 we present the total mass profiles derived 
for the full sample, focusing particularly on the total in- 
ner slope, and in Section 10 we discuss our results in the 
context of recent simulations. Section 11 summarizes our 
findings. Readers interested only in the results and not 
the technical aspects may wish to begin in Section 8. 

Throughout we adopt a ACDM cosmology with f2 m = 
0.3, n A = 0.7, and H = 70 km s" 1 Mpc" 1 . At z = 
0.25, 1" = 3.91 kpc. Magnitudes are reported in the AB 
system. 

2. THE CLUSTER SAMPLE 

Our goal is to fit simple parametric models to lensing 
and kinematic data on scales ranging from ~ 3—3000 kpc 
and to compare our results to simulations. This requires 
selecting a sample of clusters that are reasonably relaxed 
and symmetric, both to ensure that our models are ade- 
quate and to make clean comparisons with theory. Fur- 
thermore, our use of stellar kinematics to trace the mass 
distribution on small scales requires that the centers of 
the BCG and DM halo are well aligned. Table [l] in- 
troduces the sample of 7 massive clusters, which range 
in redshift from z = 0.19 to 0.31. As we describe be- 
low, A611, A383, MS2137, A963, and A2667 are well- 
relaxed clusters, A2390 is likely only slightly perturbed, 
and A2537 shows signs of a more complex mass distribu- 
tion. 

Optical images of the central ~ 1 Mpc of each cluster 



are shown in Figure[T]with X-ray contours overlaid. The 
X-ray data were obtained from the Chandra archive^] 
and point sources were removed using the CIA0 tools. 
We first discuss A611, A383, MS2137, A963, and A2667, 
which are prototypically relaxed clusters, and reserve 
A2390 and A2537 for individual comments below. The 
X-ray emission in these 5 clusters is regular, symmetric, 
and well-aligned with the BCG, and is extended along 
the same directions as both the BCG and our lensing- 
derived mass models. The alignment is quantified in 
Table [l] which shows that the X-ray centroid is typi- 
cally within a few kpc of the BCG, comparable to the 
measurement uncertainty (A. Sanderson, private com- 
munication) . Similar small offsets between the BCG and 
center of mass are derived from lens models, which we 
discuss further in Section [7.31 

It is unlikely that we have simply selected clusters in 
which the BCG is offset primarily along the line of sight, 
given that these clusters exhibit many characteristics 
that are known to be correlated with a relaxed state and 
a centrally-located BCG: a large luminosity gap between 
the BCG and the second rank galaxy, a low s ubstructure 
fraction, and the presence of a cooling core ( San derson 
et al 



J2009 

thermore, 



Smith et al.||2010 

the available reds! 



i| |Richard et"aLl|2010[ ). Fur- 
hrrti - 



s in the fields of A383, 
A611, and A2667 (see sources Table [l} are consistent 
with a unimodal velocity distribution in which the BCG 
is at rest in the cluster potential, as shown in Figure [2] 
A2390 shows slightly more complicated X-ray emission 
that is characterized primarily by a low-level extension to 
the northwest on ~ 200 kpc scales, in the same location 
as an enhanceme nt of cluster galaxies. The extension has 
long been noted ([ Kassio la et al.||1992| |Pierre et al.||1996 



Frye fc Broadhurst|[l99"8[ ). As we discuss in Section 7.1 
our strong lensing model does not demand a major ad- 
ditional mass concentration in this region, provided an 
elliptical halo is used. Further, the X-ray and galaxy dis- 
tributions are regular on larger scales, the BCG is well 
aligned with the X-ray and lensing centers (Table flj, the 
velocity distribution of cluster galaxies is unimodal and 

7 Observation IDs 3194. 2320, 4974, 903, 2214, 4962, 9372, and 
4193. 
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-3.56 



-3.58 
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120.20 



120.15 




-26.04 



-26.06 
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357.85 




17.64 



328.45 



328.40 



328.35 



Fig. 1. — Color c omposites of the cent ral regions of each cluster based on the imaging data introduced in Section |3.1| arc displayed with 
an arcsinh stretch QLupton et al.|[2004| ) . Only a small portion of the total field of view is shown. The Chandra X-ray emission in the 
0.8 — 7 keV band is overlaid, smoothed with a Gaussian kernel whose size (FWHM of 20") is indicated in the lower-left of each panel. 
Contour levels are equally spaced logarithmically but are otherwise arbitrary. Axes show the R.A. and Declination. 
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-4000 -2000 2000 4000 6000 
Velocity relative to BCG [km s~ 1 ] 

Fig. 2. — Histogram of velocities of cluster galaxies relative to the 
BCG, Av = c(z — zbcg)/(1 + 2 bcg); based on the sources listed 
in Table [TJ The available data are consistent with the BCGs being 
at rest in the cluster potentials. A2537 has a bimodal velocity 
structure: the BCG coincides with the primary peak, but there is 
a second peak at Av ~ 2000 km s _1 as discussed in the text. 

centered on t he BCG ( Figure [2])^ and there is a strong 
cooling core ( |Richard et al. 



maps and tangential shear profiles. 

3.1. Data reduction and catalog construction 

The imaging data used in our weak lensing analysis 
are listed in Table [2] Most observations w ere conducted 
with SuprimeCam (Miyazaki et al. 2002) at the Subaru 
telescope, either by the authors or using archival data. 
Its 30' field of view is well-matched to our sample. In a 
few cases additional color information is provided from 
our own observations at the Magellan Observatory or via 
archival data from the Canada-France-Hawaii telescope 
(CFHT). 

The data were reduced followingthe procedures de- 
scribed in N09 that use th e IM CATQbased p i peline de- 
veloped by IDonovanl (|2007|) and ICapak et all ( 20071) . In 



particular, we note that the sky subtraction scheme de- 
scribed in these works is effective at removing small-scale 
structure from scattered light. Halos around bright stars 
were carefully masked. All filters observed for a given 
cluster were reduced onto a stereographic projection with 
a common tangent point and pixel scale o f / /2. Abso- 



lute. astrometry was tied to the USNO-B (Monet et al. 
2(n0l From this we infer 20031 or SDSS DR7 or DR8 ( Abazajian et al"P009 [) cata- 



that A2390 is likely to be only mildly unrelaxed. 

Finally, we consider A2537, which is the most likely 
disturbed cluster in our sample. The X-ray emission 
is regular and symmetric, but centered slightly north 
of the BC G (13 kpc). There is no cool core ( |Ros-| 
setti et al.||2011|). The curvature of the arcs suggests 



that a second ma ss concentration may be present to 
north (Section 7.1 1. Crucially, the distribution of cluster 
galaxy velocities appears bimodal (Figure [2]) , with the 
main peak centered on the BCG and a second peak at 
At; ~ 2000 km s _1 . Galaxies in the high- velocity tail 
do not appear spatially distinct from the remainder. It 
is possible A2537 has not fully relaxed from a merger 
near the line of sigh t (perhaps similarly to 00024+1654; 
Czoske et al. 2002). Throughout, we bear in mind the 
uncertain dynamical state of this cluster when interpret- 
ing our results. 

3. WEAK LENSING 

We begin our discussion of the data forming the basis 
of our analysis on the largest scales. These are probed by 
weak gravitational shear, the systematic distortion in the 
shapes of background sources by the cluster. Weak lens- 
ing analyses present a number of technical challenges. 
Proper handling of the point spread function (PSF) of 
the instrument used for the observations is essential, 
since it induces spurious shear of comparable magnitude 
to the real signal and varies across the focal plane. Ad- 
ditionally, galaxies located behind the cluster must be 
isolated in order to avoid dilution of the shear signal by 
unlensed cluster galaxies and those in the fore grou nd: 
this requires multi-color photometry. In Section |3.1| we 
introduce the imaging data, primarily fr om t he Subaru 
telescope, and its reduction. In Section 3.2 we briefly 
describe our technique for extracting the shear signal, 
which was discussed more extensively in N09, and ver- 
ify our method using simulated data. Section |3.3| de- 
scribes the photometric redshift measurements used to 
select backgroun d sou rces and tests of their validity. Fi- 



logs. The frame-to-frame scatter in the final positions of 
bright stars was typically 3 — 5 mas per coordinate. Ob- 
ject detection and shape measurements were conducted 
in the R band image (/ ban d in A963) in the native 
seeing. We used SExtractor ( |Bertin fc Arnouts 1996| ) 
for detection, adopting a low threshold (DETECT .THRESH 
= 0.75, DETECT_MINAREA = 9); further selection criteria 
are described in Section l3~2l Colors were measured in 2" 
apertures by running SExtractor in dual-image mode 
on PSF-matched mosaics. 

For all clusters except A2667 and MS2137, photomet- 
ric zeropoints were determined through comparison with 
stellar photometry in the SDSS. This has the merit of 
uniform and accurate calibration when including archival 
data for which conventional standard star images may 
not be available and observing conditions are uncertain. 
Galac t ic ext inction was then corrected using the[Schlcgel 



et al (fl998|) dust maps. Transformations of stellar col- 



ors from th e SDSS to the SuprimeCam filter s ystem were 



taken from Capak et al. | ( |2007| ), |Yagi et al| ( |2010| , and 
Shim et al. 1 2006 ) where possible! For the remaining 
filters, transformations were deriv ed from fits t o syn- 
thetic photometry of stars in the Pickles (1985) spec- 
trophotometric library, based on filter and instrument 
response curves provided by the observatories. We ver- 
ified that this method yields transformation equations 
consistent with the empirical equations referenced above, 
and also with zeropoints derived from a landolt stan- 
dard field (N09) within a few percent. BVRIz photom- 
etry in MS2137, which is outside the SDSS DR8 foot- 
print, was calibrated through alignment with the stellar 
locus in A611 and A2390, taking adva ntage of the featur e 



in the VRI color-color diagram (e.g., High et al. 2009). 
Zeropoints for A2667 were taken from observations of 
other clusters on the same night, with small shifts ap- 
plied based on the stellar locus. Below we evaluate the 
accuracy of this calibration based on the derived photo- 
metric redshifts. 



nally, in Section 3.4 we present two-dimensional mass 



8 http : //www . if a.hawaii . edu/$\sim$kaiser/imcat/ 
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TABLE 2 

Imaging Observations for Weak Lensing Analysis 



Cluster 


Instrument 


Filter 


Dates of observation 


Exposure 


Seeing 


Depth 




n bkg 










time (ks) 


(FWHM, ") 


(mag, 5<r) 




(arcmin - 2 ) 


MS 91 ^7 


SO 


]j 


9007 OS 1 3 


1.4 


0.75 


26.5 








O k J 


1/1 

V ' 


9007 11 1 A 
ZUU ( -1 1-14 


1 9 


101 
l.Ul 


9^ 8 
ZO.o 








sc 


R 


2007-11-13 


2.4 


0.68 


26.2 


0.563 


17.8 






j 1 


0007 11 1/1 
^UU (-11-14 


"1 Q 

1 .a 


u.yo 


9£ A 










3 + 


zUU ( -U ( -lo 


1.0 


n a 1 
U.ol 


9/1 






A963 


nPTTT 19K 




1 QQQ 11 1 ^ 17 


7.2 


0.90 


26.3 








SC 


v 


2000-11-28 


1.8 
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3.2. Shear measurement and source selection 
Galaxy shape measu rements were performed based on 
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the Kaiser et al.| ( 1995 KSB95) method as implemented 



were described in the Appendix of N09. We have im- 
plemented a few minor changes to this procedure. First, 
the stellar anisotropy kernel ij* has been computed for 
a grid of Gaussian window functions of varying widths, 
rather than a single size. The smooth variation of q* a 
across the detector was well-fit by an 5th degree poly- 
nomial in the pixel coordinates x and y. (We refer to 
N09 for a demonstration of the quality of the PSF cor- 
rection, which is similar for other clusters.) When raw 
cllipticities are corrected for the PSF anisotropy, the fit- 
ted <7* are interpolated to match the window function 
width appropriate to each galaxy, which we take as its 
SExtractor FLUX_RADIUS, or r h . Second, rather than 
fitting the shear polarizability P 1 as a function of galaxy 
properties, we use the individual measurements for each 
galaxy. Third, selected galaxies are equally weighted in 
our shear analysis. We found that these small modifi- 
cations led to slightly better performance (a calibration 
factor closer to unity) when the shear pipeline is tested 
on simulated data, as described below. 



From the SExtractor catalog described in Section 3.1 
we selected resolved, well-detected galaxies for our shear 
analysis via the following criteria: (1) S/N > 7, where 



in the IMCAT software package. The details of this pro- 
cedure, including modeling and correction for the PSF, 



S/N is the detection significance defined in Erben et al. 
(2001 1 measured with a window function having a = 77, , 
(2) 1.1577,* < r/j < 6 pixels, where r* h is the median 



Stellar FLUX_RADIUS, 

galaxies, (3) \e\ < 



to avoid unresolved and very large 
1, \g\ < 1.5, tr P sm > 0, and 
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0.15 < P 1 < 2, to exclude sources with pathological 
moments, (4) MAG_AUT0 > 21, (5) to eliminate blended 
and asymmetric galaxies, a distance of at least 6 pixels 
to the nearest object, a distance of at least 3(r/j.i +7*^,2) 
to any other object > 3 mag brighter, and a shift of less 
than 1 pixel between centroids measured with and with- 
out the window function (d in N09), and finally (6) a 
photometric redshift selection described below. 

We verify and calibrate the shear pipel ine using sim- 
ulate d images from the STEP2 project (Massey et al. 
2007), which were designed to mimic the depth, sam- 
pling, and PSF typical of SuprimeCam data (Figure 3l). 
For PSF A (FWHM = Of! 6), we find a linear relation be- 
tween simulated and recovered shear with a slope of 0.89, 
averaged between shear components, and negligible addi- 
tive bias. Very similar results hold for PSF C (Of! 8), lead- 
ing to a mean calibration factor m\vL = fl'meas/fl'true — 
0.89 ± 0.01. This is typical of other authors and meth- 
ods. Although STEP2 does not extend to the shears 
g = 0.2 — 0.3 that we measure near cluster centers, the 
tight linearity in Figure [3] gives us confidence that the 
shear pipeline is working well and that an extrapolation 
of the calibration factor to higher shear is reasonable. 

3.3. Photometric redshifts 

We estimate photometric redshifts of all sources in or- 
der to select those located behind the clusters. This 
technique makes use of all the information ava ilable in 
the m ulti-color photometry. We use the BPZ ( jBemtez 
2000| software (version 1.99.3) with its CWWSB4 set of 
8 templates and the default prior. BPZ provides both a 
marginalized redshift probability density P(z) as well as 
a point estimator z b . We use both and define z phot = z b 
below. For 5 of the 7 clusters in our sample, the spec- 
troscopic redshift and the peak z p hot, as measured from 
bright galaxies in the cluster core, agreed with a scat- 
ter of crAz/(i+z) = 0.02. This supports the qu ality of 
the photometric calibration described in Section |3.1| In 
A2537 and A2667, the peak z phot is too high by ~ 0.1. 
This is not surprising, since these clusters are the only 
two observed through only 3 filters, and these do not 
closely bracket the 4000 A break. 

Two criteria were used to select background galaxies. 
Firstly, we required z m - ln < z p \ mt < z max , where we de- 
fine z min = z c i us + 0.1 and z max = 2 by default. (For the 
special cases of A2537 and A2667 discussed above, we 
conservatively take z m - ln — 0.55 and z mm = 0.50, respec- 
tively.) Secondly, we elimated sources with a significant 
low-redshift solution by requiring that the probability 
that z > z c \ us + 0.1, determined by integrating P(z), is 
> 90%. Adopting a higher threshold generally had lit- 
tle effect on the resulting shear profiles, but reduced the 
surface density of selected sources. A2667 showed the 
greatest possibility of residual dilution, consistent with 
the more limited photometry described above, but we 
show in Section [8] that the shear profile is consistent with 
the strong lensing and X-ray mass measurements where 
they overlap. 

Dilution of the shear signal from cluster or foreground 
sources is probably the main systematic error in clus- 
ter weak lensing analyses. Therefore, we conducted sev- 
eral astrophysical tests to assess the reliability of our 
background galaxy identification. These are illustrated 
in Figure [4] for the case of A611. Firstly, we looked for 



an angular clustering signal between galaxies identified 
as in the cluster or the foreground, and those in sev- 
eral bins of higher redshift. The cross-correlation signal 
(left panel) is low or absent at z < 2, while the auto- 
correlation in the foreground bin is prominent. If we ad- 
mit sources with z p hot ^ 2, a significant clustering with 
low-redshift sources arises from confusion between the 
photometrically-inferred B aimer and Lyman breaks; this 
motivates our choice of z max — 2. Secondly, we examined 
the radial shear profile, which shows a well-defined rise 
when using our selected background sample and a flat, 
low signal when using the remainder of sources (middle 
panel), as expected if they are mostly unlensed. Finally, 
we investigated the surface density of sources as a func- 
tion of cluster-centric radius (right panel). The density 
of cluster galaxies rises rapidly towards the center, while 
that of background sources is flat or declines. These tests 
give us confidence that the photometric redshifts are ef- 
fective at isolating lensed sources. 

In our shear analysis we incorporate the individual 
£phot measurements of the background sources. How- 
ever, as a check of our z p hot distribution, we computed 
the mean distance ratio (Di s /D s ) that determines the 
lensing efficiency (Table [2| . We then selected galaxies 
from the COSMOS survey with a matching magnitude 
distribution in the detection band and with similar z m - m 
and z max cutsrl The ( Di s / D s ) determed from the 30- 
band z phot in COSMOS filbert et al]|2009| ) agreed with 
our determinations with a scatter of only 3%, suggest- 
ing that errors in the mean distance to the background 
sources have a minimal effect on our analysis. 

3.4. Results 

The mean distortion of background galaxies is a mea- 
sure of the reduced shear g = 7/(1 — ft), where 7 and 



k are the shear and convergence (e.g., Schneider]|2006[ ) 
Figure [5] displays the azimuthally-averaged tangential re 
duced shear for all 7 clusters. In general we select galax- 
ies with 100 kpc < R < 3 Mpc for the shear analysis. 
At smaller radii there are few sources and contamina- 
tion from cluster galaxies is most severe, while the outer 
limit corresponds roughly to the SuprimeCam field of 
view. In A2667 and A2537, where our photometry is 
less extensive, we require R > 150 kpc to account for 
the greater possibility of dilution at small radii. In all 
clusters, a smoothly rising tangential shear profile is ob- 
served, with no clear evidence for dilution from contam- 
inating foreground sources. A significant B-mode signal, 
which should not arise physically and is thus often used 
a diagnostic of systematic errors, is not detected. 

For each cluster we also produce d two-dimensional 
(2D) surface density maps following jKaiser fc Squires 
(1993), which are shown in Figure p| lb increase the 
surface density of sources, we loo sene d the P(z) selec- 
tion criterion described in Section [3.2| this has no effect 
on our quantitative results, which do not rely on the 
2D maps. In general mass and light are well-aligned, 
and any other structures in the fields are detected at 
marginal significance. (This can be seen by noting that 

9 In detail, we increased z m i n by 0.1 to account for the effects of 
our P(z) cut that could not be directly mimiced in COSMOS. The 
COSMOS broadband photometry was linearly interpolated to the 
central wavelength of our detection band when necessary. 
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Fig. 4. — Tests of our background galaxy selection for the weak lensing analysis, demonstrated in A611. Left: Angular correlation 
between galaxies with z p hot securely below the selection threshold z m ; n = 0.39, and those in other bins of 2 () -, U)l . The auto-correlation at 
z < z m i n and the lack of any cross-correlation with other z m i n < z p hot < 2 galaxies indicate low contamination. Middle: Tangential shear 
profile measured using selected background sources (red), which shows the expected rise toward the center, and using excluded foreground 
and cluster galaxies (blue), which shows a flat, low signal. Right: The surface density profile of galaxies with secure z p hot near the cluster 
redshift (red) shows the expected rise, while the density of selected background sources (blue) is flat or declining toward the center. 
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the dashed contours show the mass reconstructed using 
the -B-mode signal: all such peaks are spurious and give 
an indication of the number of noise peaks of a given 
significance expected in this field of view.) These mass 
maps are useful for investigating the upturn or plateau 
in the radially-averaged shear signal seen at large radii in 
A383 and A2667. The upturn in A383 is likely related to 
substructures near the virial radius, and following N10, 
we therefore restrict to R < 1.5 Mpc in this cluster. 

In A2667, the radial shear profile shows a high plateau 
to R > 3 Mpc, which is explained in the mass map 
by a second large mass concentration clearly detected 
6.2' = 1.4 Mpc north of the main, strong-lensing cluster. 
The secondary clump detected in the lensing map is ex- 
actly aligned with an excess of bright red galaxies near 
the cluster redshift (Figure |6| . The brightest of these 
galaxies has a redsh ift z — 0042 from the 2dF survey 
(Colless et al. 2001), corresponding to a comoving dis- 



tance of 100 Mpc along the line of sight. This suggests 
the second clump is slightly in the foreground of A2667. 
In our weak lensing study we model both mass concen- 
trations simultaneously, and results for the main cluster 
are independent of the redshift of the second peak. 

4. STRONG LENSING 

We now turn to the identification of sources multiply 
imaged by the clusters. Every cluster in our sample has 
been imaged by HST, and every one except A2537 has 
been the subject of an earlier lensing study, as described 
below. We refer to and build upon these models. In 
Sections 4.1 through 4.7 we consider each cluster indi- 
vidually, and in Section 4.8 we describe the construction 
of catalogs of cluster galaxies relevant as perturbers in 
our strong lens models. 

The positions of the multiple images are illustrated in 
Figure [7] and tabulated in the Appendix. We have re- 
tained the nomenclature of various authors; however, in 
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all cases the final number or letter distinguishes multi- 
ple images of the same source. In several cases we have 
added new spectroscopi c red shifts based on the observa- 
tions detailed in Section [6. 1| These spectra are shown in 
Figure [8} 

4.1. MS2137 

This famous cluster presents tangen tial and radial arc s 
at z = 1.501 and 1.502, respectively ( |Sand et al.]|2002| ). 
We incorporate t wo additional images to the model of 
Sand et al. (2008): a fourth counter-image 3d to system 
3, and the mirror image (2c) of the radial arc. The latter 
was not included in our previous analyses due to the 
difficulty of securing a clear identification in the light 
from the BCG, but the counter-image is clear in recent, 
deeper imaging from the CLASH survey ( |Postman et al. 
2012a|). 



4.2. A963 

A set of merging images fo rms the "northern arc" at 
z = 0.771 (Ellis et al. 1991). Since conjugate points 
could not be clearly identihed, we incorporate this arc as 
constraint on the pos ition of the critical line, following 
Richard et al. (2010 R10), which is assumed to pass 
through the arc. 

4.3. A383 

The mode l foll ows Nil, which bu ilt upon [Sand et al. 



( |2004[|2008| ) and |Smith et a l. 
z = 6lt)2 7 images (system 5" 



(20051). We add the pair of 



ater identified spectroscop- 



ically by Richard et al. (20111, along with minor shifts 
to other image positions made based on deeper imaging 
from CLASH. The radial and tangential arc system at 
z = 1 .01 (systems 1 and 2, Smith et al.||2001 Sand et al. 
20041 and a complex system with a redshift z = 2.55 



(system 3, Nil) strongly constrain the mass model. Sub- 
sequent near-infrared observations confirmed the latter 
redshift via Ha and [O III] emission lines and provided 
a more precise value z = 2.58 (Belli et al., in prepara- 
tion). We have not included system 6 as a constraint 
due to its peculiar and unexpected sy mmetry (see dis- 
cussion by Morandi & Limousin 20121, but do report a 
spectroscopic redshift z = 1.826. 

4.4. A611 

We adopt the model of N09, comprising a five-image 
system with an originally-reported redshift of z = 2.06 
(system 1), a giant tangential arc at z = 0.908 (system 2), 
and a four-image system with no spectroscopic redshift 
(system 3). These redshifts were published in R10. A 
subsequent near-infrared spectrum of system 1 revealed 
an unambiguous redshift of z = 1.49 via Ha, H/3, and 
[O III] emission (Belli et al., in preparation). This shows 
that the redshift z = 2.06 in R10 resulted from a misiden- 
tification of the single rest-UV emission line C III] A1909 
as C IV A1549. We ret urn to the impact of this on mass 
models in Section |8.2| Additionally, the counter-image 
of the faint Lya emitter identified in R10, whose posi- 
tion was suggested by the original lens model, is a less 
likely identification in models based on the new redshift. 
Thus, we do not include this system as a constraint. We 
located probable central counterimages of systems 1 and 
3 well within the BCG light (see N09, Figure 6) based on 



predictions of the lens models. Although we have con- 
servatively not imposed their positions as constraints, we 
verified that including the central image of system 1 (the 
more reliable identification) would not significantly influ- 
ence our results. 

4.5. A2537 

This cluster displays many spectacular arcs that have 
so far not been modeled in the literature. We iden- 
tify four systems with new spectroscopic redshifts of 
z = 1.970,2.786, and 3.607 (Figures [?[ and [|) . Several 
conjugate images were initially identified on the basis 
of similar morphology to construct a preliminary lens 
model, which was iteratively refined to locate the posi- 
tions of the other images. Image systems 1 and 2 arc 
located within a 3-fold "naked cusp" arc at z — 2.786. 
Systems 3 and 4 form 5-fold images at z — 1.970 and 
z = 3.607, respectively, both containing central images 
within the radial critical line. We discuss the inclusion of 
galaxy PI as a perturber in our lens model in Section [7j 

4.6. A2667 



Our model is based on that of Covone et al. (2006 
C06). It consists of a n extremely bright giariTtangen- 
tial arc at z — 1.034 (Sand et al. 20051 and two sys- 



tems with no spectroscopic redshifts named B and D in 
C06 (3 and 4, respectively, in our nomenclature). Based 
on interim lens modeling, we identified two additional 
counter-images 4.3 and 4.4 shown in Figure[7j The giant 
arc is incorporated via two features (systems 1 and 2) 
located as flux maxima and minima. 

4.7. A2390 



The lens model is based on those presented in |Jullo| 
(2008) and R10. It contains tw o arcs at z = 4.05, the H3 
and H5 systems of P cllo et al. (1999). (For reasons dis- 
cussed in Section [7.2| we do not include all the detectable 
conjugate points within these arcs as constraints.) The 
41a/b system was previously identified on the basis of 
clear mirror symmetry but has no spectroscopic redshift. 
We secured a new spectroscopic redshift z — 0.535 for 
the 51a/b system near the cluster center, as well as a 
redshift z — 1.036 for the giant red arc (system B) to the 
southeast of the BCG based on very weak [O II] emis- 
sion (Figure^]). Two conjugate points in the red arc were 
identified as flux minima in an HST /WFC3-IR F125W 
image (proposal ID 11678). The lens model predicts a 
counter-image to the northeast of the BCG, which we 
locate but do not include as a constraint due to uncer- 
tainty in its precise position (it appears to be superposed 
on a singly- imaged portion of the galaxy). 

4.8. Cluster galaxy identification 

Strong lens models must account for mass in cluster 
galaxies, which perturb the positions of critical lines lo- 
cally. We initially identified likely cluster galaxies as 
those with photometric redshifts near that of the clus- 
ter (|Az| < 0.15). In A2537 and A2667, for which only 
two colors are available, we instead identified the locus 
of the cluster in the color-color plane. Absolute magni- 
tudes in the r b and were estimated a nd compared to 
Mr* = —21.38 (Rudnick et al. 20091, appropriate to 



cluster galaxies at the redshitts ot our sample. Only 
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Fig. 7. — HST images of the central cluster cores, with multiply-imaged sources identified (circles). Where possible we show color 
composite images, using data from the sources in Table \3\ or from the CLASH survey (A611, MS2137, A383). Reconstructed image 
positions based on the models described in Section [9] are indicated by crosses (colors vary for clarity); critical lines are also overlaid at the 
redshifts zcl indicated in each panel. Individually-optimized perturbing galaxies are denoted PI, P2, etc. 
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Fig. 7. — Continued 

galaxies brigher than 0.1L r ^ were considered, unless they 
fell close to a multiple image. Early-type galaxies with 
L ~ 0.1L* have a sa 90 km s _1 using the scaling relations 
we introduce in Section [7| which corresponds to deflec- 
tion angle of ~ 0'.'15 in the singular isothermal sphere 
approximation, well within the uncertainty of cr pos = 0'.'5 
that we assign to the image positions. The radial ex- 
tent of the sample was limited to extend safely beyond 
the strong lensing zone. This catalog was manually re- 
fined in some cases. Although initially based on our 
multi-color ground-based catalogs, the parameters of the 
galaxies (center, ellipticity, P. A., flux) were refined using 
the HST imaging. The final catalogs contain ~ 10 — 60 
galaxies, varying with the richness of the cluster and the 
extent of the strong lensing zone. 

5. BCG PHOTOMETRY 

In order to model the distribution of stellar mass in the 
BCG and to interpret our kinematic observations, the lu- 
minosity profile of the galaxy must be known. Further- 
more, we wish to relate the stellar mass-to-light ratios 
derived in our models to estimates from stellar popula- 
tion synthesis, particularly in Paper II. In this section, we 
present fits to the surface brightness profiles and broad- 
band colors of the BCGs. 

5.1. Surface brightness profiles 

Interpreting stellar dynamics in the BCG requires a 
model for the distributions of luminous tracers and mass. 
The dPIE parametrizatiorj^l is particularly appropriate, 
since it is analytically convenient, widely used in lensing 
studies, and provides good fits to observed galaxies. It 
is characterized by two scale radii r COTe and r cu t , and the 
three-dimensional density is defined by 



PdPm(r) 



Po 



(l + r7r c 2 oro )(l + rVr c 2 ut )' 



(1) 



The analytic properties of the profil e and the intro- 
duction of ellipticity are discussed by |Ehasdottir et al. 



(20071. The spherical radius enclosing half of the light 
r cut , while the projected effective radius is 
in the limit r corc /r cut <C 1. We fit dPIE 



Th 



4 ^cut 



profiles to the BCGs in our sample using HST imag- 
ing obtained in reduced form from the Hubble Legacy 
Archive, selecting observations around 6000 A, which is 
close in wavelength to the absorption features used to 
derive kinematics (Section [6]) . In A2390 we opted to use 
a F850LP observation instead, due to a prominent cen- 
tral dust feature, although this had little effect (~ 8%) 
on the derived radius. The filters and instruments used 
are listed in Table [3] 

The background level in the HST images was adjusted 
based on blank sky regions far from the BCG. A noise 
map was constructed based on the background and shot 
noise from the BCG. Light from other galaxies in the 
field was carefully excluded using large elliptical masks 
generated from SExtractor parameters and then manu- 
ally tuned. The geometric parameters of ellipticity, po- 
sition angle (P.A.), and center were first determin ed by 
fitting an R 1 / 4 profile to the 2D data using Galf it (Peng 



et al. 2002). We then extracted elliptical isophotes and 
fit the ID surface brightness profile in the inner 20" to 
a dPIE model using a custom code, accounting for the 
HST PSF. MS2137 and A383 present gradients in P.A., 
and the BCG geometry contributes to the modeling of 
their radial arcs. In these clusters, we thus fixed the 
PA. to that measured near these arcs. 

Figure [9] demonstrates that this procedure produces 
goods fits to the data, particularly within the radial range 
most critical for the dynamical modeling (solid lines). In 
the inner 10", rms residuals are typically 5%. At larger 
radii, some BCGs have a cD-type upturn in their surface 
brightness profile th at is not well-fit with a single com- 
ponent model (e.g., |Gonzalez et al.||2005|). This causes 



10 Also referred to as a PIEMD, or pseudo-isothermal elliptical 
mass distribution. 



errors in the total luminosity and radii, but these are cor 
related such that the surface luminosity density within 
~ 10" is well fixed. This all that is necessary for our dy- 
namical and lens models, given that the kinematic data 
are confined to R < 5" in all but one case (A383), and the 
mass budget is always dark matter-dominated beyond a 
few arcseconds. 

Varying the background level produced 5 — 10% sys- 
tematic variations in r cut . Five clusters we additionally 
fit a redder band (F850LP, F125W, or F160W) in ACS 
or WFC3 imaging to investigate trends with color. In 3 
cases the derived radii agree to < 7%, within the system- 
atic errors, while in the remaining pair (A611 and A383) 
the radii are ~ 20% smaller in the redder band. Even in 
these cases, the color gradients are minimal (< 0.1 mag) 
within R < 7", so the differences mainly reflect gradi- 
ents beyond ~ R e . While the redder data likely better 
trace the stellar mass, the dynamics are dark matter- 
dominated at these large radii. We therefore considered 
it more important to accurately model the tracers and 
adopted the measurement s at ~ 6000 A. This choice is 
justified further in Section [9~3l 

5.2. Stellar population synthesis 

We additionally fitted stellar population synthesis 
(SPS) models to the BCG colors. Since the BCG is of- 
ten saturated in our Subaru imaging, we also rely on 
photometry from the SDSS or HST imaging. The SDSS 
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Fig. 8. — Spectra of multiply-imaged sources obtained in new observations described in Section |6.1| The axis at the top of each plot 
indicates the rest-frame wavelength. Selected lines are identified, and areas of residual sky emission or absorption are hatched. The spectra 
are not flux calibrated, and the flux units are arbitrary. Multiple features are identified in each spectrum, resulting in a unique redshift 
determination with the exception of A2390 B. The identification of the single weak emission line in the latter case as [O II] is supported by 
photometric redshift estimates of this red arc. 



TABLE 3 

HST SURFACE PHOTOMETRY OF BCGS 



Cluster 


Instrument /Filter 


Tout (kpc) 


dPIE fit parameters 
r C orc (kpc) b/a 


PA. 


Mag. 


Ly 
(lO 11 /^) 


Proposal 
ID 


MS2f37 


ACS/F625W 


18.7 ± 2.6 


1.4 


0.89 


75 


17.31 


3.20 


12102 


A963 


WFPC2/F702W 


35.6 ±4.6 


0.47 


0.81 


6.4 


15.41 


4.61 


8249 


A383 


ACS/F606W 


38.2 ± 3.0 


1.2 


0.89 


8.7 


15.81 


4.06 


12065 


A611 


ACS/F606W 


46.2 ± 3.4 


1.2 


0.73 


42.3 


16.81 


5.47 


9270 


A2537 


ACS/F606W 


52.7 ±6.5 


0.75 


0.74 


-58.5 


16.90 


5.86 


9270 


A2667 


WFPC2/F606W 


68.8 ±10.6 


0.26 


0.69 


40.4 


16.33 


3.89 


8882 


A2390 


ACS/F850LP 


24.4 ±2.9 


0.44 


0.73 


-50.6 


15.79 


2.92 


10504 



Note. — Uncertainties in r cu t include random and systematic errors assessed by varying the background. Errors in r core , b/a and 
P.A. (measured in degrees east of north) are negligible for our analysis. Circularized radii are reported. The rest-frame Lv is corrected 
for Galactic extinction; the observed magnitude is not. The uncertainty in the observed magnitude and in Ly assuming a dPIE model is 
~ 0.1 mag. 
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Fig. 9. — Surface brightness profiles of BCGs, measured in HST 
imaging through the filters indicated in Table [3] Data are shown 
as diamonds, with formal errors usually smaller than the symbol 
size. These are vertically offset as shown in the caption for clarity. 
dPIE fits are drawn as solid lines throughout the radial interval 
most relevant for dynamical modeling and dotted outside. The 
critical interval is estimated approximately as where the surface 
brightness exceeds 10% of that at the outer limit of the kinematic 
data (indicated by top arrows). 



TABLE 4 

Stellar Population Synthesis Fits to BCGs 



Cluster 




JVfilt 


Photometry source 


MS2137 


2.05 


10 


HST ACS & WFC3 (CLASH) 


A963 


2.31 


4 


SDSS DR.8 


A383 


2.26 


7 


HST ACS & WFC3 (CLASH) 


A611 


2.24 


5 


Subaru & HST WFC3-IR 


A2537 


2.32 


•1 


SDSS DR8 


A2667 


2.04 


5 


HST WFPC2, ACS, NICMOS 








(proposal IDs 8882 & 10504) 


A2390 


1.80 


5 


SDSS DR8 



Note. — Stellar mass-to-light ratios T^y S are derived from SPS 
fits assuming a Chabrier IMF. Nfn t denotes the number of filters 
used in the fit. The luminosities Lv are given in Table [3] and 
include any internal dust extinction. 



colors are based on model magnitudes, while colors in 
HST imaging are based on apertures with radii ~ 2'.'5 
that avoid other galaxies, local dust features, and arcs. 
(This aperture corresponds to roughly the radial extent 
over which the stellar mass d ominates.) The kcorrect 
code ( Bla nton fc Roweis|2007 ) was used to fit SPS mod- 
els from which a fe-correction to the rest-frame y-band 
luminosity Ly was computed (Table [4j. The luminosity 
was scaled to match total flux of the dPIE model and 
corrected for Galactic extinction. We assigned errors of 
10% to all photometric measurements in the fitting pro- 
cess - much larger than the random errors - to account 
for systematic errors in the photometry and models. 

These SPS models fits also provide an estimate of the 
stella r mass-to-light r atio T^ v s = M*/Lv appropriate 



for a |Chabrier| ( |2003[ ) initial mass function (IMF). Fol 
lowing standard practice, the stellar masses refer to the 
current mass in stars and do not include any gas lost 
during stellar evolution (e.g., Treu et al. 2010 Cappel- 



lari et al.|2012 ) . The photometric data and derived 



SPS 
V 



ratios are listed in Table |Zj Overall the T^y S estimates 
are quite uniform, with an rms scatter of only 9%. Re- 
assuringly, the BCGs with the lowest estimates (A2667, 
A2390, MS2137) are those that show the strongest emis- 
sion lines (Section |6| and the most prominent cooling 
cores. The far-infrared emission detected by Herschel in 
A2390 and A2667 also indicat es that these syste ms host 
some ongoing star formation ( Rawle et al.||2012 ). 

By perturbing the photometric measurements by their 
errors, we estimate the typical random uncertainty in 
T^y S is about 0.07 dex. Systematic uncertainties were 
estimated by comparing measurements de rived from a 
variet y of codes. 
21 i09| 



Firstly, we use d FAST (Kriek et al 
to construct grids of both |Bruzual & Chariot 



BC03) and Chariot & Bruzual (2007, CB07) mod- 
els with exponentially-declining star-formation histories. 
The range of parameters was restricted appropriately for 
massive ellipticals: ages t with 9.5 < logt/yr < 10, star- 
formation timescales r with 8 < logr/yr < 9.5, dust 
attenuation with < Ay < 0.5 mag, and solar mctallic- 
ity Mean stellar masses were estimated by marginaliz- 
ing over the likelihood surface. (Simply taking the best- 
fitting model elevated logM* by ~ 0.05 dex on average.) 
Secondly, for A963 and A611 we are able to com pare to 
the MPA/JHU ca talog of SDSS galaxies (DR7; |Kauff-] 



maun et al.||2003[ ). Finally, in addition to the above 
comparisons involving our BCG sample, we also used 
kcorrect to fit massive ellipticals at 0.15 < z < 0.35 
with 4-band photometry observed in the SLACS sur- 
ve y. The resulti n g stel lar masses were compared to those 



of Auger et al 
constructed priors 



(|2009l, which were based on carefully- 
In all of the above comparisons, we 
find systematic mean offsets of < 0.06 dex compared to 
the masses derived using kcorrect. This level of un- 
certainty is typical given the current state of SPS. We 
conclude that our stellar mass scale is close to that of 
other authors who use similar data. 

6. BCG KINEMATICS 

Here we present long-slit spectroscopic observations 
of the BCGs in our sample and the spatially-resolved 
stellar kinematics derived from them. As we demon- 
strate below, the data are of sufficient quality to mea- 
sure stellar velocity dispersions to typical radial limits 
of rj 10 — 20 kpc, while the slit width and seeing limit 
the resolution on small scales to « 3 kpc. The stellar 
kinematic data thus probe the mass distribution from 
the smallest scales, where stars dominate the mass, out 
to radii where DM is dynamically significant. In combi- 
nation with lensing, they provide a long lever arm with 
which to study the inner mass distribution. 

6.1. Observations and reduction 

We undertook spectroscopy of the BCGs using the 
Keck I & II and Magellan Clay telescopes, as recorded 
in Table [5] Total exposure times ranged from roughly 
2 to 7 hours. Five clusters were observed using the 
L ow-Resolution Im aging Spectrometer (LRIS) on Keck 
I (Oke et al. 1995) using the 600 mm -1 grism blazed 



at 4000 A in the blue arm and the 600 mm 1 grating 
blazed at 7500 A in the red arm. A2537 and A2390 were 
observed through slitmasks in order to simultaneously 
secure redshifts of multiply-imaged sources and of clus- 
ter members. The A383, A611, and A963 BCGs were 
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Fig. 10. — Spatially-resolved spectra of BCGs with fits used to measure kinematics. Grey lines show the data, and the fitted models are 
shown in blue. Each spatial bin is normalized to a median flux density of unity. The bins are then offset vertically for clarity. The top axis 
indicates the rest-frame wavelength. Grey bands denote masked pixels as described in the text. In A2390, A963, and A383 the Mg 6/Fe 
spectral region was also observed and fitted, but only the G band region is shown here. Symmetric spatial bins on either side of the BCG 
center are co-added for display purposes where possible, although fits were performed separately. (This was not done in A2390 due to its 
low-level rotation.) Spectra have been lightly smoothed with a 2 A boxcar. 
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observed using a long slit. In A383, we additionally ob- 
served a slitmask designed to cover gravitational arcs 
The A611 and A383 observations were first presented in 
N09 and Nil, respectively. MS2137 was o bserved using 



the E chelle Spectrograph and Imager (ESI; Sheinis et al. 
2002 1) on th e Keck II telescope, as presented by Sand 



eTaT (20021. Finally, A2667 was observed using LDSS-3 



at the Magellan Observatory. In all but one case, the slit 
(Table[5| was aligned close to the major axis of the BCG, 
with some minor deviations tolerated to include gravita- 
tional arcs. For MS2137 the slit was instead aligned along 
the radial arc near the minor axis, although its isophotes 
are nearly circular. 

The long-slit spectra were reduced with IRAF using 
standard techniques for bias subtraction, flat fielding, 
wavelength calibration, trace rectification, a nd sky sub- 
tractio n as previously discussed in N09 and |Sand et al.| 
(2004). For this work we have re-reduced the order of the 
ESI spectrum containing the G band in MS2137 using 
similar methods. Multi -slit data were reduced using the 
software developed by Kelson (2003). The wavelength- 
dependent instrumental resolution was measured via un- 
blended sky lines or arc lamps and fitted with a low-order 
polynomial. The typical resolutions of the blue and red 
LRIS spectra are a = 159 and 115 km s" 1 , respectively, 
while the ESI and LDSS-3 observations have resolutions 
of a = 32 and 84 km s -1 . These are much smaller than 
the velocity dispersions encountered in BCGs, so the un- 
certainties of a few km s _1 in resolution have a negligible 
~ 1% effect on the derived dispersions. 

The center of the BCG was shifted to the center of 
a pixel during the reduction processes so that spatially- 
binned spectra could be extracted symmetrically on ei- 
ther side of the center. Our analysis focuses on two spec- 
tral regions with strong absorption features appropriate 
for kinematic study: the G band at A4308 and the Mg I 
b region containing Fe A5270, Fe A5335 and other weaker 
lines. For the LRIS observations, the spatial bins were 
determined by adding CCD rows until a minimum signal- 
to-noise ratio (S/N) of 20 A -1 was reached in the Mg 
b/Fe spectral region of the LRIS-R spectrum, suitable 
for reliable kinematic measurements. A minimum num- 
ber of rows comparable to the seeing element was also 
required. In some cases the outermost bin constructed 
by this scheme was conservatively excluded due to con- 
tamination of the key absorption features by sky resid- 
uals. Bins likely contaminated by flux from interloping 
galaxies were also excluded; this includes the innermost 
bin in A383. 

When possible (A963, A2390, A383) identical spatial 
bins were extracted in the spectral region around the G 
band in the LRIS-B spectrum, which was facilitated by 
the equal pixel scale of the detectors. Although the for- 
mal S/N is lower at the G band, we found these spectra 
could nonetheless be reliably followed due to the cleaner 
sky. For A2537 and A611, the LRIS-B spectra were not 
used owing to the coincidence of the G band with the O I 
A5577 sky line and the dichroic transition, respectively. 
For the ESI spectrum of MS2137, we considered only the 
order containing the G band, since the Mg 6/Fe region 
was strongly affected by atmospheric absorption. For the 
LDSS-3 spectrum of A2667, we extracted the rest-frame 
4000-5280 A interval, which was covered continuously. 



Figure [10] shows the extracted spectra. 

6.2. Kinematic measurement technique 

In each spatial bin, the velocity and velocity dispersion 
were measured by direct fitting of Gaussian-broad ened, 
redshifted stellar spectra using the pPXF software (Cap- 
pellari fe Emsellem|2004 1 , accounting for the instrumen- 
tal resolution. An additive continuum polynomial was 
included in the fit, with the order determined identifying 
that beyond which the fit quality in the highest-S/N bin 
did not improve significantly. The derived velocity dis- 
persions were insensitive to reasonable choices of the con- 
tinuum order to a precision of ~ 1 — 3%. For the spectra 
that were not flux calibrated (A2667 and A611), a first- 
or second-order multiplicative polynomial was allowed to 
modulate the spectral shape. For flux-calibrated spectra 
this yielded no improvement in the fit, and the addi- 
tional freedom was therefore excluded. Emission lines, 
regions of prominent sky subtraction residuals or absorp- 
tion, and remaining defects were masked. Random un- 
certainties were assessed by shuffling the residuals in 5 
pixel chunks, which maintains their correlation proper- 
ties, adding these to the best-fitting model, and re-fitting 
the resulting spectra many times. This generally pro- 
duced la error estimates only slightly larger than those 
derived from the \ 2 surface. 

The stellar templates used to fit the BCG spectra were 
constructed from the MILES library ( Sanchez-Blazquez 
et al. 2006). By default we allowed pPXF to build an 
optimal template from a linear combination 203 MILES 
stars with spectral types G5-K5 and luminosity classes 
III and IV, appropriate for old stellar populations. The 
template was determined using the spatially-integrated 
spectrum and was then used to fit each spatial bin. Mg I 
b, which is enhanced in massive galaxies, was masked 
since it gener ally produced bias ed results, consistent with 
other studies ( |Barth et al.|2002 ' ). The resulting templates 
produce excellent fits to the BCG spectra, as shown in 
Figure (TQ) 

We experimented with including a wider range of stel- 
lar templates, including all non-peculiar stars of spectral 
types A-K in the MILES library and a subset that ex- 
cludes those with low metallicity. For the A2390 and 
A2667 BCGs, some A- and F-type stars were preferred, 
consistent with the likely star-formation activity dis- 
cussed in Section [5]2 Our inclusion of these earlier spec- 
tral types impacts the derived dispersions in these sys- 
tems by < 5%. We also cons tructed templates based 
on the Indo-US coude library fValdes et al.||2004|). Fi- 



nally, we experimented with templates optimized to each 
bin, rather than constructing a single template based on 
the integrated light; this led to no noticeable system- 
atic changes. Details of the template construction led to 
systematic changes in the derived velocity dispersions at 
the 3 — 5% level. Based on our estimates of uncertainties 
related to the template and the continuum polynomial 
order, we assign a systematic uncertainty of 5% to all 
velocity dispersions, consistent with previous studies. 

6.3. Velocity dispersion profiles 

We detected no significant rotation in all but one BCG. 
In A2390, the measured rotation of 44±13 km s _1 is neg- 
ligible compared to the central velocity dispersion, with 
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TABLE 5 

Spectroscopic observations 



Cluster 



Instrument 



Date 



Exposure (ks) P.A. (deg) Seeing (") Slit width (") 



Mode 



MS2137 

A963 

A383 

A611 

A2537 

A2667 

A2390 



Keck/ESI 
Keck/LRIS 
Keck/LRIS 
Keck/LRIS 
Keck/LRIS 
Magellan/LDSS-3 
Keck/LRIS 



2001 Jul. 28 
2012 Apr. 18 
2009 Oct. 12-14 

2008 Mar. 1 
2009 Oct. 12-14 
2007 Jul. 15, 17 
2009 Oct. 12-14 



6.7 

7.8 
23.7 

7.8 
14.4 
19.8 
14.4 





-15.5 

2 

45 
125 
27.4 
-45 



0.8 
2.5 
0.7 
1,1 
0.8 
0.9 
0.8 



1.25 Cross-dispersed 

1.5 Long-slit 

1.5 Long- & multi-slit 
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Fig. 11. — Resolved stellar velocity dispersion profiles, with cross and diamond symbols denoting independent measurements on either 
side of the BCG center and colors denoting measurements in the spectral regions indicated in the caption. Radii are measured along the 
slit. Points with error bars show the weighted mean measurements, with errors including a systematic estimate as described in the text. 
The final panel combines these measurements for the full sample. 



(v/a) 2 = 0.026. In the remainder of our analysis, we thus 
focus only the velocity dispersions. When multiple mea- 
surements of the dispersion in the same radial bin were 
available, either from fits on either side of the BCG cen- 
ter or in different spectral regions, they were combined 
with a weighted mean to produce a more precise esti- 
mate. This is justified given that the agreement between 
independent measurements is very good overall: of the 
87 pairs of overlapping measurements, 79% agree within 
la using the random error estimates only. In a few bins 
the spread among estimates appeared greater than could 
likely be explained by random errors only, and in these 
cases the error bars were inflated based on the spread in 
estimates. In all cases, 5% was added in quadrature to 
the final uncertainty to account for the systematic effects 
described in Section [6~2l 

The derived veloc ity dispersion profiles for each cluster 
are shown in Figure 1 1 including the weighted mean esti- 
mate and the individual measurements described above. 
The data are listed in Table [6l In all 7 clusters, the ve- 
locity dispersion rises with radius. This contrasts strik- 
ingly with massive field ellipticals, which show velocity 
dispersion profiles that are flat or slowly declining (e.g., 
Carollo et al. 1995 Gerhard et al. 2001: Pad manabhan| 
et al.|2004p . Our data imply a strongly rising total mass- 
to-light ratio, which we will show in Section [lO] can be 
naturally explained by the cluster-scale halo. An alter- 



native explanation for the rising dispersions is that the 
stellar orbits rapidly become more tangential at large 
radii. This can be tested using the detailed shape of 
stellar absorption lines in nearby systems, which would 
reveal "peakier" profiles at large radii if circular orbits 
dominate. Observations of local cD galaxi es instead fa- 
vor nearly isotropic or mildly radial orbits dCarter et al. 
19991 JKronawitter et aLl[2000| |Saglia et al.||2000| |Hau 
et al. 2004p , which indicates that the rising dispersions 
are not an artifact of the orbital distribution but reflect 
the genuine dynamical influence of the cluster potential. 

6.4. Comparison to previous work 
We have reanalyzed the spectra of A61 1 and MS2137 



presented in N09 and Sand et al. (20021, respectively, 
an d obtained a new , deeper spectrum ot A963 compared 
to|Sand et al.l (I2004I) . The A383 spectrum and kinematic 



measurements are identical to Nil, with the exception 
of a small adjustment (< la) to the outermost bin only. 
However, the velocity dispersion measurements in A611, 
MS2137, and A963 have changed systematically and sig- 
nificantly compared to t he previously publ i shed values 



While the earlier works (Sand et al. 2002 2004 2008 



N09) indicated a flat or even declining (in the case o 
MS2137) dispersion profile in these clusters, we find a 
rising trend in common with the rest of the sample. 
Given that multiple codes and techniques were used to 
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TABLE 6 

Velocity dispersion profiles 



Cluster 
MS2137 

A963 

A383 



Radial bin (arcsec) <r (km s 1 ) Cluster Radial bin (arcsec) a (km s 



A611 



- 


0.22 


292 ± 22 


A2537 


- 


0.41 


284 ± 14 


0.22 


-0.65 


311 ± 21 




0.41 


- 1.22 


315 ± 19 


0.65 


- 1.08 


319 ± 27 




1.22 


- 2.03 


328 ± 20 


1.08 


- 2.09 


360 ± 36 




2.03 


- 2.84 


360 ± 22 


- 


0.68 


313 ± 17 




2.84 


- 3.65 


385 ± 43 


0.68 


- 2.03 


336 ± 18 


A2667 


- 


0.47 


228 ± 19 


2.03 


- 3.38 


369 ± 24 




0.47 


- 1.42 


243 ± 16 


3.38 


-4.73 


413 ± 36 




1.42 


- 2.36 


279 ± 28 


0.41 


- 1.22 


272 ± 15 


A2390 


- 


0.41 


266 ± 17 


1.22 


- 2.03 


281 ± 16 




0.41 


- 1.22 


291 ± 19 


2.03 


-2.84 


304 ± 17 




1.22 


- 2.03 


331 ± 23 


2.84 


- 3.65 


326 ± 19 




2.03 


- 2.84 


356 ± 25 


3.65 


-4.46 


323 ± 20 




2.84 


- 3.65 


374 ± 32 


4.46 


- 5.27 


373 ± 31 




3.65 


- 4.46 


420 ± 43 


5.27 


- 6.08 


411 ± 37 










6.08 


- 8.78 


465 ± 41 










- 


0.55 


317 ± 20 










0.55 


- 1.65 


347 ± 20 










1.65 


- 2.75 


380 ± 25 










2.75 


- 3.85 


368 ± 27 










3.85 


- 5.61 


452 ± 45 











Note. — Line-of-sight velocity dispersions are derived from averaging observations on 
either side of the BCG center and, in most cases, in multiple wavelength intervals, as de- 
scribed in Section |6.1| Radii are measured along the slit, which is oriented near the major 
axis with the exception of MS2137; they can be circularized using the axis ratios in Table|3] 
Error bars include a 5% systematic component added in quadrature. 



reduce the pres ent data, yielding very similar dispersion 
profiles (Figure 11 final panel), the differences in these 
measurements appear unrelated to the data reduction it- 
self. More likely they arise from improvements to the 
velocity dispersion measurement procedure. In particu- 
lar, we now (and in Nil) rely on large libraries of high- 
quality stellar spectra to construct templates, whereas 
earlier works were restricted to a relatively small num- 
ber of stars observed with ESI. Furthermore, we now 
construct composite templates from linear combinations 
of these spectra, rather than taking a sin gle star. This 
provides much higher-quality fits (Figure 10 ) with vir- 
tually no residual "template mismatch." We have also 
tested the dispersion measurements in MS2137 using an 
independent code developed by M. Auger and find identi- 
cal results (A. Sonncnfcld, private communication). The 
earlier suboptimal templates used in earlier works prob- 
ably led to biases at higher a or lower S/N. Given the 
high quality of the data (th e rising a can be seen by eye 
in many panels of Figure 10 1, the improved methodology, 



ular, they allow for deviations in the inner regions from 
the CDM density profiles produced in numerical simula- 
tions. As we described in Section [T] this is the region 
where the effects of baryons or non-standard DM should 
be the most pronou nced. The generalized NFW profile 
(gNFW, [Zhao]p96l ) , given by 



Pdm(0 



Ps 



{r/r s )P(l + r/r s ) 



3-s- 



(2) 



and the resulting uniformity of the dispersion profiles, we 
are confident in the present results. 

7. MODELING THE CLUSTER MASS DISTRIBUTION 

Having introduced the observational data that form 
the basis of our analysis, we now describe the models 
and methods that we use to infer the cluster mass distri- 
bution. As in N09 and Nil, our mass model consists of 
three components: the DM halo, the stars in the BCG, 
and the mass in other cluster galaxies. Each is described 
by one or more analytic models, introduced below, and 
the parameters of these models are constrained simulta- 
neously using our full data set. 

Two flexible functional forms are adopted to describe 
the dark halo. In addition to length and density scaling 
parameters, each includes a third parameter that allows 
for variation in the shape of the density profile. In partic- 



reduces to the NFW profile when /3 = 1, but the asymp- 
totic inner slope dlogpDM/^log?' = — (3 as r — > can be 
varied. When we fix f3 = 1 to fit NFW models, we refer 
to the virial mass M200 as that within a sphere of radius 
^200 that has a mean density equal to 200 times the crit- 
ical density /? cr ;t of the universe at the cluster redshift. 
The concentration is then C200 = ?"20o/ r s- 

In order to verify that our results do not strongly de- 
pend on the functional form of the density profile, we 
have introduced a second parametrization that we refer 
to as a "cored NFW" (cNFW) model: 



bp s 



{l + br/r s ){l + r/r s y 



(3) 



This is simply an NFW profile with a core introduced, 
i.e., with asympotically constant density as r — > 0. The 
scale of the core is controlled by the parameter b. A char- 
acteristic core radius can be defined as r core = r s /b\ at 
this radius, the density falls to half that of an NFW pro- 
file with equal r s and p s . As r corc — > (b — > 00) the pro- 
file approaches the NFW form. We follow the Lenstool 
convention and use the parameter <7q = ^Gp s r 2 s in place 
of p s . This is simply a defined scaling and should not be 
taken as the actual velocity dispersion. 

We considered additionally using Einasto models, 
which have been shown to provide more accurat e repre- 
sentations of halos in numerical simulations (e.g., |Merritt| 
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et al.||2006| |Navarro et al]|2010[ [Gao et al.||2012| . How- 
ever, this form is not optimal tor observational studies of 
the inner halo, because the behavior at large and small 
radii are strongly coupled: to explore flat inner profiles, 
one has to accept steep declines in the outer regions. 
By contrast, the large-radius behavior of the gNFW and 
cNFW density profiles are invariant. Further, Einasto 
profiles with the range of shape parameters seen in sim- 
ulations can be approximated by gNFW profiles within 
~ 10% over the relevant range of radii. 

The stellar mass in the BC G is mo deled with a dPIE 
profile, introduced in Section 5.l| n The center, P. A., 
ellipticity, and scale lengths r coro and r cut are fixed based 
on the fits to HST imaging described in that section. 
The only free parameter is then the stellar mass-to-light 
ratio T*v = M*/Ly, which we assume to be spatially 
invariant within t he B CG. (This assumption is discussed 
further in Section |9.3|) We parametrize T^y relative to 



the values T^y S derived from our SPS fits, based on a 



Chabrier IMF (Section 5.2 ) 



log «sps 



logT,v/T^ s 



(4) 



(Treu et al. 2010). We place a very broad uniform 
prior on logasPS> corresponding to a mass that is 
1.5x lighter than T^y S to a mass 2x heavier than the 
T^y S inferred using a Salpeter IMF, where we take 
logM^saip/M^chab = 0.25. The total allowed range in 
T*v is thus a factor of 5.3. 

The final ingredient in the mass model is the dark and 
luminous mass in non-BCG cluster galaxies, which are 
significant perturbations in the strong lensing analysis. 
The iden tification of these galaxies was described in Sec- 
tion |4.8| Their mass is modeled using dPIE profiles. The 
center, ellipticity, and P.A. are fixed to that of the light, 
and for most of the cluster galaxies, the structural pa- 
rameters are tied to scaling rel ation s specific to each clus - 
ter (e.g., |Limousin et al.|200"7| N09; |Richard et al.|2010] ): 



^*cut — ^cut,* (-^r f -^r,* ) ^ 
^"corc — ^corc,* (-^r /-^f,* ) 
(7 = (7^(L r /L r ^) 1 ^ 4 



(5) 



Following previous work (e.g., N09; |Richard et al.|2010 ), 
we place a Gaussian prior on cr* of 158 ±27 km s based 



on the obse rved scaling relations in the SDSS (Bernardi 
et al.|200 3 |. Based on the galaxy-galaxy lensing study of 
Natarajan et al. ( 2009 ), we allow r cuti * to vary from 15 to 
bU Jtpc. As those authors note, this is much larger than 
the optical radius of the galaxies, and our dPIE models 
therefore include galaxy-scale dark halos. Our analysis 
is insensitive to r corc „, which is thus fixed to 0.15 kpc. 

These scaling relations are sufficient for the majority of 
cluster galaxies. In some cases, however, the position of 
a multiple image can be strongly influenced by a nearby 
galaxy. In these situations, the galaxy is freed from the 
scaling relations and modeled individually. These galax- 
ies are indicated in Figure [7j It is sufficient to free either 
a or r cut , since their effects are de generate, and in prac- 
tice we usually fix a based on the Bernardi et al. ( 2003 1 



Note that there is no distinction between the halo of the BCG 
and that of the cluster, which would be observationally impossible 
and is not well-defined theoretically. 



results and vary r cut . We not e on e peculiar case, that of 
galaxy PI in A2537 (Section [15]). We found that indi- 
vidually optimizing this perturber improved the model- 
ing of the arc system composed of images families 1 and 
2, although PI is clearly deflected and located behind 
the cluster (z p h ot = 0.59 ± 0.04 in SDSS DR8). This 
suggests a possible interesting two-plane effect, which is 
beyond the scope of this paper to fully model. Never- 
theless, we find that the inferred mass parameters are 
consistent with an ~ galaxy, which agrees reasonably 
with the (demagnified) luminosity. 

The intracluster medium (ICM) is not modeled as a 
distinct mass component in our analysis and is there- 
fore implicitly incorporated into the halo. In the present 
paper we focus on the total density profile, so the sepa- 
ration of the DM and gas is not a concern. Based on the 
~ 3 kpc spatial resolution of our spectra, we also do not 
consider a supermassive black hole. Observations of lo- 
cal BCGs indicate this becomes dy namically significan t 
only at smaller scales < 1 kpc (e.g., Kelson et al.| [2002). 



7.1. Additional mass components 

In A2667 the weak lensing map (Figure pi) shows a 
clear second clump located ~ 1.4 Mpc n orth ofthe BCG, 
which is likely in the foreground (Section 3.4). Due to the 
large separation, this mass is unimportant tor our strong 
lensing and dynamical analysis, but it must be considered 
for weak lensing. We therefore added a second dark halo 
to the model near the position indicated in the 2D mass 
map, as listed in Table [7j Since the internal structure is 
not well-constrained by the shear data, an NFW profile 
is assumed with a broad log- normal prior on C200 The 
mean of this prior was taken to be 4, appropriate to 
the virial mass of logM2oo/ M = 14.7 inferred from the 
full modeling discussed below, although adopting an even 
broader prior did not significantly affect the results. 

We experimented with adding a second mass clump 
to the west of the BCG in A2390, based on the exten- 
sion of galaxies and X-ray emission on ~ 100 kpc scales 
discussed in Section [2j but found that this did not im- 
prove the quality of the fit to the lensing data and sub- 
stantially lowered the Bayesian evidence. We therefore 
consider a single dark clump to be sufficient. In A2537 
the curvature of the arcs suggests a possible additional 
mass clump to the north of the BCG, which is given 
further credence by the multimodal dynamical structure 
described in Section [2] We experimented with adding a 
second clump and found that it did improve the Bayesian 
evidence when only strong lensing constraints are fit, but 
not with the full data set. The inferred mass was small 
(~lx 10 13 M ), and correspondingly the most relevant 
parameters for our study (halo mass and concentration, 
inner slope, T»y) change little. Therefore, we retain a 
single dark clump when fitting this cluster also. 

7.2. Inferring mass models from data 



Our analysis is based on th e Lenstool code (Kneib 



et al. 1993 



Julio et al. 



2007), which has been widely 



used tor studying strong lenses. For this project we have 
added components to Lenstool that incorporate weak 
lensing and stellar kinematic constraints. The inference 
method is fully Bayesian. The prior distributions we 
adopted are listed in Table [7] For the key parameters 
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TABLE 7 

Prior Distributions Used in the Cluster Mass Models 



plane, with 



Parameter 



Units 



Prior 



Cluster-scale dark matter halo 

e (pseudoellipticity) 
PA. 

r s 



deg 
kpc 
km s 



P (gNFW models) 

b (cored NFW models) 

Stellar mass in BCG 
logasps 

Cluster galaxy scaling relations 
a- t km s — 1 

r C ut,* kpc 



u(...V 
u(...y 

£(50, 1000) 
L(500,3500) 
[7(0.01,1.5) 
L(l,1000) 

U(-0. 176, 0.551) 



G(158±27) 
17(15,60) 

Individually-optimized galaxies each add an additional 
parameter as discussed in the text. 



Weak-lensing shear calibration 
m W L 



G 2ct (0.89±0.05) 



Additional parameters for individual clusters 



A611 
Redshift of source 3 



A2667: second NFW clump at R ~ 1.4 Mpc 





arcsec 


G(7±45) 


a« 


arcsec 


G(370 ± 45) 


e 




17(0,0.3) 


PA. 


deg 


U(0, 180) 


M300 


M 


L(10 13 , 10 15 ) 


In C200 




G(ln(4) ± 0.4) 


Redshift of 


source 3 


1/(1,4.5) 


Redshift of 


source 4 


(7(1,4.5) 


A2390 






Ax 


arcsec 


G(0± 1.5) 


Ay 


arcsec 


G(0± 1.5) 


A383 (see 


Section [8A] and Nil) 




9dm 




1/(1,2.5) 


q* 




see Nil 


m x 




G 2ct (0.9±0.1) 



Note. — U(x,y) denotes a uniform prior over the interval 
bounded by x and y. L(x,y) denotes a prior that is uniform in 
the logarithm. G((j, ± cr) denotes a Gaussian prior with mean fi 
and dispersion cr, while Gia denotes a Gaussian prior truncated at 
2<r. Positions Ax and Ay are given relative to the BCG; positive 
values indicate west and north, respectively. Position angles are 
measured east of north, t The intervals were determined based on 
initial lensing fits; see the text for the special case of A963. 



(i.e., those describing the DM halo and T*v) we chose 
uninformative priors that are broad and flat. A Markov 
Chain Monte Carlo (MCM C) method is use d to explore 



the large parameter space ( Julio et al.|2007 |. 

For each model proposed by the MCMC sampler, a 
likelihood is computed based on the full data set. Since 
we assume the errors in our measurements are indepen- 
dent and Gaussian, this is equivalent to summing \ 2 
terms based on the strong lensing, weak lensing, and stel- 
lar velocity dispersion constraints: 



2 2 

X =XSL 



XWL 



XVD- 



(6) 



XSL 



E 



{.Xi 



Dbs\2 



bs\2 



Y + (w - v? s ) 



(7) 



where (xi,yi) and (x° , y° ) are the predicted and ob- 
served positions, respectively, of a sin gle image, and th e 
sum runs over all multiple images (see I Julio et al.|2007[) . 
In two clusters somewhat different techniques were used. 
In A383, x|l was calculated in the source plane when we 
include kinematic data, due to the slower two-int egra l 
dynamics we compute only in this system (Section |8.1[ ). 
We verified this has a minimal effect on the results, sec- 
ondly, in A963 the merging images that form the tan- 
gential arc could not be clearly separated (Section |4.2[ ). 
We therefore identified a symmetry point and required 
that the critical line pass through it, with a positional 
uncertainty of / .'2. We also imposed Gaussian priors of 
e = 0.21 ± 0.02 (the pseudoellipticity introduced below) 
and P.A. = (86±3)°, based on the shape of the isophotes 
at the radius of the tangential arc, since the break point 
provided to Lenstool cannot constrain them. 

The uncertainty in the image positions cT pos is a key 
quantity when combining strong lensing with other data 
sets. Although compact images can in principle be lo- 
cated in HST imaging with an astrometric precision of 
< 0'.'05, cluster lens models are generally not able to re- 
produce image positions to better than a ~ 0'.'2 — 0'.'3, 
with a scatter of up to ~ 3 " in the best-studied clusters 
(e.g., Limousin et al.|[2007 |. This is likely partly due to 
perturbations by unmodeled subs tructures, either i n the 

An 



2010) 



cluster or along the line of sight ( Julio et al. 
additional factor is that simply-parametrized models are 
not perfect representations of real or simulated clusters. 
This is particularly important when combining diverse 
data: since strong lensing constraints are exquisitely pre- 
cise, assigning a very small positional uncertainty can 
fully constrain the model, leaving the other data little 
room to influence the fit. Given that strong lensing, weak 
lensing, and stellar kinematics contribute comparably to 
the logarithmic radial extent of our study, it is critical 
not to overly concentrate the weight of the data in one 
radial interval. We find that <7p OS = 075 strikes a rea- 
sonable bal anc e and adopt this for our analysis (except 
see Section 7.3 on A2390). For the same reason, we have 
generally not imposed the detailed substructure of arcs 
as constraints on the model. 
Weak lensing constraints are incorporated by the term 



Xwl 



E 



~obs \ 2 i / n ™ ~obs \ 2 

9i,i ) + (92,imwL - 92A ) 



The strong lensing analysis is conducted in the image 



(8) 

where (g^f,g2?) is the observed reduced shear polar 
g = 7/(1 — re) for galaxy i, (gi t i, <?2,i) is the model reduced 
shear at the angular position and photometric redshift of 
galaxy i, and the factor m^i incorporat es ou r shear cal- 
ibration. Based on the results in Section [3.2[ we assign a 
Gaussian prior of 0.89 ± 0.05 to towl- The uncertainty 
G g is dominated by the intrinsic ellipticities of galaxies 
( "shape noise" ) and is estimated using the standard devi- 
ation in shear measurements far from the cluster centers 
to be a g = 0.32. 

Only the halo is considered in the weak lensing mod- 
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eling, since the mass is DM-dominated on > 100 kpc 
scales. The ellipticity of the halo in the plane of the sky 
is incorporated us i ng th e "pseudo-elliptical" formalism 
of Golse & Kneib ( 2002 1 , in which the ellipticity is in- 
troduced in the lens potential. Using their notation, we 
derive 



measurement, we adopted a lower ov 



073 for these 



7i,. 



|7| cos(20 £ 



en 



72, e = sin(20 e )VT" 
n e — k + e\j\ cos(2</> £ ), 



(9) 
(10) 
(11) 

where 71^,72,0 and K e are the shear components and 
convergence for the elliptical model, and I7I and k are 
the corresponding values for a circular lens. (See also 
Dumet-Montoya et al. 2012 ) As described by Golse & 
Kneib| ( |2002[ ), the pseudoellipticity e is approximately 
the ellipticity of the potential and not that of the sur- 
face mass density, which is about twice as large. The 
pseudoelliptical formalism is also used for the strong lens- 
ing modeling. It is a reasonable approximation for the 
moderate elliptici ties e < 0.3 encountered in our sample 
( [Sand et al.|[2008| . 

Finally, we compute the l.o.s. velocity dispersions ui os 
using the spherical Jeans equation. We assume the BCGs 
are completely pressure-supporte d, co nsistent with the 
lack of observed rotation (Section |6.1|: 



v*{r)M(r)F(r) 



„2-2/3 a 



dr. 



(12) 



By default we conside r isotropic orbits with /3. 



and T{r) = y/r 2 - R 2 ( |Cappellari|2"008l ). Here 1/* and E 







are the density and surface den sity profiles of the stellar 
tracers, as measured in Section 5.1 and M(r) is the to- 
tal mass (stars and DM) enclosed within a radius r. In 
A383, axisymmetric two-integral dynamical models are 
used due to the significant l.o.s. elongation of this c lust er. 
These are described fully in Nil (and see Section 8.1 1. 

The observational effects of seeing an d the slit width 
are included following Sand et aL| (2004). The model cti os 
are spatially binned to match the extraction apertures 
used for the data. These constraints are incorporated as 



Xvd 



A? 



(13) 



where <7j and of are the model and observed l.o.s. dis- 
persions in bin i, r espectively, and A^ is the uncertainty. 



As discussed by|Sand et al. (2004 2008), a spherical 



treatment is a good approximation to trie dynamics of 
the galaxies in our sample, which have a mean axis ratio 
(b/a) = 0.8. Furthermore, detailed local studies find that 
massive, non-rotating ellipticals are intri nsically close to 
spherical and have low aniso tropy (e.g., Gerhard et al. 
2001 |Cappellari et al. 2007). We discuss the effects of 
introducing miid orbital anisotropy into our dynamical 
models in Section [9731 



7.3. Alignment between the halo center and the BCG 

In order to locate the center of the DM halo, gNFW- 
based models in which the center of the halo was allowed 
to vary from that of the BCG were fit to the lensing 
data, taking a Gaussian prior with a = 3" along each 
axis. Since we are concerned only with an astrometric 



fits only. The inferred offsets between the centers of the 
halos and BCGs are given in Table [I] They are typically 
~ 1 — 4 kpc with a la uncertainty or~ 1 — 3 kpc, roughly 
consistent with the typical offset between the BCG and 
X-ray centroid. Given that the offsets are small and of- 
ten not significant, we have fixed the center of the halo 
to that of the BCG in the following analysis. This allows 
for a consistent lensing and dynamical analysis. We note 
also that the P.A. of the DM halo is close to the BCG 
light in all cases, never differing by more than 14° in 
projection. (Given that BCGs often exhibit ~ 15° gra- 
dients in position angle, such small differences are not 
completely well-defined.) 

The one exception to the above is A2390. While its 
lensing and kinematic data can be well-fit when the halo 
center is fixed to the BCG, the resulting models demand 
an unusually high Y^v (logaiMF > 0.42 at 95% con- 
fidence). Given the possible complexities in the mass 
distribution in A2390 described in Section [2j we consid- 
ered it prudent to increase the freedom in this model and 
allow the center of the halo is to vary slightly from the 
BCG. We took a Gaussian prior having a = 175, based 
on the lensing analysis described above. The positional 
uncertainty <7 pos was also relaxed to 170. (Nonetheless, 
the best-fitting models still reproduce the image posi- 
tions with a fidelity of 075, as shown in Section M) 

8. COMPARISON BETWEEN LENSING- AND 
X-RAY-DERIVED MASS PROFILES: CONSTRAINING 
THE LINE-OF-SIGHT ELLIPTICITY 

Before turning to the cluster mass distribution over 
the full radial range 3 — 3000 kpc spanned by our com- 
plete data set, in this section we first consider fits based 
only on strong and weak lensing, excluding velocity dis- 
persion constraints, and compare these to independent 
X-ray measures. As we describe below, the combination 
of projected and three-dimensional mass measures allows 
us to constrain the l.o.s. geometry of the clusters in our 
sample. 

Lensing directly probes the gravitational potential pro- 
jected along the l.o.s., whereas the ICM follows the three- 
dimensional potential. Mock observations of simulated 
clusters show that to a remarkable degree, X-ray ob- 
servations are able to recover spherically-a veraged mass 
profiles with a scatter of only ~ 5 — 10% (|Nagai et al 
2007| JLau et al.||2509| |Meneghetti et al.|[20TDbj |Rasia 



This is true even when a spherical geom 

The same 



et^.|[20T2 ) 

etry is (incorrectly) imposed in the analysis 



et al. 



simulations show that X-ray masses are biased slightly 
low due to non-thermal pressure support arising primar- 
ily from bulk gas motions. This bias is generally esti- 
mated to be only ~ 10%, although this depends on the 
detailed physics inclu ded in the simulati ons and may be 
somewhat higher (see Rasia et al. 20121. As argued in 
Nil, when much larger discrepancies between X-ray- and 
lcnsing-derived masses are encountered in relaxed clus- 
ters, they most likely arise from elongation or compres- 
sion of the mass distribution along the l.o.s. 

By comparing projected (lensing- or Sunyaev- 
Zel'dovich-based) and nearly spherical (X-r ay) mass 
measures, the l.o.s. shape can be inferred (e.g.,|Piffaretti 
[12005] |De Filippis et al ||2005HSereno 
di et al.l 12010 20111 INewman et al. 



et al. 



2003; jGavazzil 
2006 IMorand 
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TABLE 8 

NFW Parameters Derived from X-ray and Lensing Analyses 



Cluster 


r s (kpc) 


X-ray 

C200 


Source 


r a (kpc) 


Lensing 
C200 


(Strong + weak) 
logM 2 oo/M 


r 2 oo (kpc) 


MS2137 


180+J° 


8 19+ - 54 
°' 19 -0.56 


S07 


H9+32 


1 1 ns+ 2 - 81 

j.i.uo_ 2 3g 


14.561°;!? 


1318 


1-140 
-107 


A963 


390+"° 


4 7 q+0.84 
'-0.77 


S07 


197+« 


7 91 +1.59 
' - z -0.94 


i4.6iirj 


1430 


f 127 
-151 


A383 


4701"° 


3-8°_ 7 5 


A08 


260l|| 


6 "51 +°' 92 
°- 01 -0.81 


14 89+0-09 


1691 


H28 
-102 


A383 (prolate) 








372l°f 


4 40+0. 50 
^'^ a -0.48 


14.80 ±0.08 


1665 


f 107 
-95 


A611 


o 9n +200 




S07 


317l|5 


r: c: fi +0.65 
o ' OD -0.60 


14.92 ± 0.07 


1760 


+97 
-89 


A2537 




4.861?:" 


S07 


4 42 +46 


4 63+ 35 


15.12 ± 0.04 


2050 


+65 
-69 


A2667 


700 +479 
(uu_ 207 


3-02l ;« 


A03 


,zo -109 


9 QQ+O' 32 

z.a»_ 27 


15.16 ±0.08 


2164 


+137 
-129 


A2390 


7c 7 +1593 
'°'_393 


3.20li-J 7 


A03 


76 o + 119 


3 24+0-35 
°- z -0.31 


15 34+O.O6 


2470l{^ 



Note. — All X-ray fits are to the total gravita ting mass an d have been standardized to the same cosmology. Sources: S07 = |Schmidt| 
|fc Allen||2007| , A08 = | Allen et H] \2008) , A03 = |AIlen et aT] j2003|. The A383 (prolate) row show a fit to lensing and X-ray data using 
triaxial isodensity surfaces ( liquation Il4| and see Nil); we report sphe 



report sphericalizcd NFW parameters. 



2011| |Morandi & Limousin|2012[ ) assuming that the ICM 
is near equilibrium. This is important for the present 
analysis given that stellar kinematics reflect the three- 
dimensional gravitational potential. Coupling projected 
(lensing) and kinematic measurements can thus lead to 
errors if the l.o.s. shape of the cluster is highly aspheri- 
cal and this is not taken into account via an independent 
probe, such a s X-ray data (see discussions in Nil and 
Gayazzi||2005[ )p1 

Assuming for simplicity that one of the principal axes 
of the cluster is along the l.o.s. (2- axis), the surface den- 
sity derived from the lensing can be deprojected onto 
isodensity surfaces with coordinates 



r = Vi 1 - + (1 + £ s)y 2 + (z/q) 2 



(14) 



Here es and q parametrize the projected and l.o.s. ellip- 
ticity, respectively. For a "spherical" deprojection, q = 1. 
Note that lensing precisely measures the projected ellip- 
ticity but does not itself constrain q. 

In order to compare our lensing results to X-ray analy- 
ses, we have compiled the results of several studies listed 
in Table [8] X-ray studies typically adopt a paramet- 
ric form for either the density or temperature profiles, 
and these studies adopted an NFW profile to represent 
the total density. For a clean comparison, it is thus ap- 
propriate to restrict to NFW models for the dark halo 
when fitting the lensing data in this section. Further, 
since X-ray studies generally do not separately model 
the BCG, we include only the dark halo in the lens- 
ing mass measurements below; this ha s a minor effect 
outside the innermost bin. Figure [12] shows the ratio 
M\ ens /Mx of the spherically-enclosed mass Mi ons derived 
from lensing by assuming a spherical deprojection, to 
the mass Mx bas ed o n X-ray analyses. The inner er- 
ror bars in Figure 12 reflect the statistical uncertainty, 
which for the lensing mass is derived from the Markov 
chains. Estimating the uncertainty in the X-ray-based 
mass at a given radius cannot be done precisely with 
published NFW parameters, since the covariance is usu- 

12 As described in Nil, the dynamical problem is more complex 
due to velocity anisotropy. As a result, the problem is acute only if 
the dark halo and stellar tracers have significantly different l.o.s. el- 
lipticities and this is neglected. If the tracers and total mass are 
stretched equally along the l.o.s., the reduction in density is nearly 
balanced by the boosted velocities of stars moving along the major 
axis, and the projected velocity dispersions change fairly little. 



ally not given. We therefore estimated this using the full 
A383 mass profile provided by S. Allen (private commu- 
nication), including properly propagated errors, rescaling 
the errors based on the X-ray flux and exposure time as 
appropriate for Poisson-dominated formal errors. This is 
sufficiently accurate for our purposes given that system- 
atic unce rtai nties are comparable. The larger error bars 
in Figure 12 include an additional 10% systematic contri- 
bution added in quadrature that refle cts uncertainties in 
the Chandra temperature calibration ( Reese et al.|2010" ). 

In general the agreement between the X-ray- and 
lensing-based masses assu ming a spherical deprojection 
is very close, as Figure 12 and Table M demonstrate. A383 
is clearl y di screpant, with Mi ens > Mx; as discussed in 
Section 8.1 and Nil, this can be explained by a prolate 
halo that is elongated along the l.o.s. For the remaining 
6 clusters, however, the mean trend 

M lens /M x = (1.07 ±0.01) - (0.16 ±0.04) log r/100 kpc 

_ . (15) 

(dashed in Figure |12| errors are random only) is con- 
sistent with unity within the systematic uncertainty of 
ss 0.1. None of these 6 clusters show systematic devi- 
ations larger than \M\ ens /Mx — 1| ^ 0.2 over scales of 
50 — 600 kpc. At r ~ 100 kpc, where strong lensing fixes 
the mass, the spherically-deprojected mass Mi ens scales 
roughly cx q 6 for an NFW profile with the range of r s 
encountered in our sample. Therefore, the similarity of 
the X-ray and lensing measures implies that \q— 1| < 0.3 
in these systems, with the mean l.o.s. ellipticity being 
smaller ((q — 1) w 0.1 — 0.2). The asphericity will be yet 
smaller if some of the elevation of Mi ens /Mx is not due to 
geometry but to non-thermal pressure in the ICM, which 
is expected. 

Strong-lensing-selected clusters as an ensemble are 
sometimes thought to be biased toward clusters elon- 
gated along the l.o.s., since this orientation boosts the 
lensing cross-section. Given that l.o.s. elongation and 
non-thermal pressure support would both act to elevate 
M\ ens / Mx, our results show that the clusters in our sam- 
ple must be both close to hydrostatic equilibrium and not 
strongly elongated along the l.o.s. (excepting A383). We 
note that our sample consists of fairly massive clusters. 
Since any compression or elongation along the l.o.s. is 
constrained to be both small and consistent with null 
within the systematic uncertainties, q = 1 is fixed for 
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Fig. 12. — The spherically-enclosed mass Mi ens derived from a 
strong and weak lensing analysis, assuming a spherical deprojec- 
tion, is compared to that derived from published X-ray studies, 
Mx. The random and total (including a 10% systematic estimate) 
errors are reflected in the inner and outer error bars, respectively. 
Note that measurements at various radii are not independent, as 
they are derived from two-parameter NFW models. T he d ashed 
line indicates the mean trend excluding A383 (Equation \15\ . 

the remainder of our analysis in all clusters except A383, 
which is discussed individually below. The effect on our 
results of vary ing q within the allowed limits is discussed 
in Section [9~3) The good agreement between lensing and 
X-ray masses further supports our contention that we 
have selected relaxed clusters (Section [2]). We discuss 
the mass-con cent ration relation described by our sam- 
ple in Section [TO 



8.1. The case of A383 

As shown by Nil and i ndependently confirmed by 
Morandi & Limousin (2012), A383 is significantly elon- 
gated nearly along the l.o.s. This is unique in our sample 
and necessitates a special treatment for A383 in several 
ways. Since the method was detailed in Nil, only a 
summary is provided here. The l.o.s. ellipticities of the 
dark halo goM and BCG stars are included as addi- 
tional parameters. An additional Xx t erm incorporates 
the spherical ly-averaged mass profile derived from the X- 
ray analysis ( Allen et al. 12008 ) into the likelihood, which 
constrains Qdm as described above. Since the projected 
ellipticity (1 — b/a ~ 0.1) is much less than that along 
the l.o.s., the stellar dynamics can be approximated us- 
ing a prolate axisymmetric model with the symmetry 
axis along the l.o.s. (Using a more sophisticated model 
of the intracluster medium, Morandi & Limousin (20121 
also showed that the major axis is close to the l.o.s., 
with a separation of 21 ± 10°.) By accounting for the 
l.o.s. geometry, Nil showed that the lensing, kinematic, 
and X-ray data can be brought into agreement. 

8.2. Comparison to previous lensing results 

All of the clusters in this sample have been the sub- 
ject of previous lensing analyses. Although a compari- 
son with each of these earlier works is impractical, and 
in most cases our NFW parameters agree with previous 
determinations within the errors, we wish to note a few 
cases that have been the source of some confusion in the 



literature. Firstly, in N09 we reported a high concentra- 
tion (C200 = 10-0 ± 1.1) for strong lensing-based fits in 
A611, which was in tension with weak lensing estimates 
at the time and is higher than the present measurements. 
This is attributable to t he revised spectroscopic redshift 
discussed in Section l4~4l 



Secondly, Gavazzi 



p005[ G05) studied the mass dis- 
tribution in MS2137 using strong and weak lensing and 
reported substantial differences between lensing- and X- 
ray-based mass models. They inferred that a significant 
elongation along the l.o.s. was a likely explanation. Our 
lensing results instead agree closel y with a recent anal- 
ysis by Donnarumma et al. (2009). They are also con- 
sistent with X-ray measurements by Schmidt & Allen 
(20071, which is incompatible with the highly prolate 
shape (q ~ 2) suggested by G05. This likely arises from 
a numerical error in the G05 results (R. Gavazzi, private 
communication) . 

9. THE TOTAL DENSITY PROFILE 

In this section we combine strong and weak lensing 
with resolved stellar kinematic measures within the BCG 
to constrain the radial density profiles of the 7 clusters 
in our sample over r ~ 3 — 3000 kpc. We remind the 
reader that the present paper is concerned with the total 
density, inclusive of dark and baryonic matter, not that 
of the DM alone as in our earlier works ( Sand et al.|2002 



|2004||2008[|Newman et al.|2009[|2011| ), which is discusser 
in Paper ll. 

9.1. Mass models and fit quality 



The top panel of Figure 13 shows the total density 
profiles ptotM that are inferred using gNFW (solid) and 
cNFW (dotted) models for the halo. The colored bars at 
the bottom illustrate the radial extent of each data set, 
which taken together provide coverage over most of the 
3 decades in radius plotted. Correspondingly, the mass 
models are tightly constrained over the entire range. Fur- 
thermore, the density profiles derived using gNFW and 
cNFW models (Section [7]) are virtually identical. This 
demonstrates that the derived density profiles do not 
strongly depend on the particular parametrization of the 
halo. For this reason, we focus on the gNFW-based mod- 
els for the remainder of the paper. 

Given the simple parametrization of the mass distribu- 
tion, it is important to verify that good fits are achieved 
to the wide range of data. The middle and bottom panels 
of Figure n3] demonstrate that, in all statistically 
acceptable fit to the weak lensing and stellar kinematic 
data is obtained. The quality of the strong lensing fits is 
shown in Figure [7J in which the positions of the multiple 
images in the best-fitting model are indicated as crosses, 
and in Table [9j which lists the rms error in these posi- 
tions. The image positions are typically matched within 
0'.'5, which is fairl y typical of other s tudies using sim- 



ilar models (e.g., Richard et al. 2010). In some cases, 



the best models predict images that were not included 
as constraints because they could not be unambiguously 
identified (Section El), particularly when buried in clus- 
ter galaxy light, but no predicted counterimages lack a 
plausible identification when one should be observable. 

In A2667, the modeled shear arising from the second 
mass clump located at R ~ 1.4 Mpc has been subtracted 
from the data points in Figure [l3j Nevertheless, the 



24 



Newman et al. 





10 9 




10 8 




10 7 




10 6 




10 5 




10 4 


pc ; 


10 3 






© 


10 8 


o 


10 7 








10 6 




10 5 




10 4 




10 3 



gNFW 
cNFW 



MS2137 



A2537 



Stellar kinematics'".- 
Multiple images 
Weak lensing 




10 100 1000 10 100 1000 10 100 1000 

Radius [kpc] 




100 



1000 



100 



1000 100 
Radius [kpc] 



1000 




10 1 
Circularized radius [kpc] 

Fig. 13. — Total density (top), tangential reduced shear (middle), and velocity dispersion (bottom panel) profiles for fits to lensing and 
stellar kinematic data. In all panels the shaded region and dotted lines indicate the 68% confidence intervals for the gNFW and cNFW 
models, respectively. Top: The radial intervals spanned by each data set are indicated. Middle: The shear averaged in circular annuli is 
shown for display purposes, although elliptical models are used throughout the quantitative analysis. For A2667, the shear from the second 
clump is subtracted as described in the text. Bottom: Model dispersions (shaded and dotted) include the effects of seeing and the slit 
width; the dashed line shows the mean gNFW model excluding these effects. The extraction radii of the data have been circularized. 
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TABLE 9 

Quality of Fit to Strong Lensing Constraints 



Cluster 


gNFW cr img 


cNFW cr img 


MS2137 


0'.'44 


0'.'43 


A383 


0'.'46 


0'.'60 


A611 


IX' 60 


0'.'58 


A2537 


IX' 65 


0'.'66 


A2667 


(X'29 


0'.'27 


A2390 


0'.'53 


0'.'60 



Note. — C7i mg indicates the rms error in the image plane posi- 
tions of lensed sources in the best-fitting model. In A963 a single 
constraint on the critical line location is used; it is thus fit exactly. 

measured shear exceeds the model at R > 2 Mpc, which 
may indicate a more complex mass distribution near the 
virial radius. The fit quality at smaller radii and the close 
agreement with X-ray measurements reassure us that the 
mass is well-modeled within ~ 2 Mpc 

9.2. The total inner density slope 

The top panel of Figure [M] shows that the density pro- 
files of these clusters are similar in their inner regions. 
At very small radii < 0.003r2oo ~ 5 kpc, the density pro- 
files often steepen. As we describe in Paper II, this is 
where the density becomes stongly dominated by stars. 
However, outside this innermost region the slopes of the 
total density profiles are quite comparable. To quantify 
this similarity, we introduce a measure of the total in- 
ner slope 7tot = — d\ogptot/d\ogr. Since the BCG and 
the DM halo are modeled as distinct components, 7 to t 
is not a directly inferred parameter. We define it by fit- 
ting a line in the log r — log ptot plane over the interval 
r / r 200 — 0.003 — 0.03, illustrated at the top of the panel, 
with errors derived by repeating this for many models in 
the Markov chains pj For the median r2oo in our sample 
(Table[8]), the corresponding interval is 5 — 53 kpc, or typ- 
ically ss 0.2— 2R e in terms of the effective radius R e of the 
BCG. The endpoints of this range are well-constrained 
by stellar kinematics and strong lensing, and therefore 
7t t is observationally robust. 

The bottom panel of Figure [14] shows the probability 
distributions of 7tot, which is well-constrained for each 
cluster, with a typical formal la uncertainty of 0.07. In 
order to characterize the mean inner slope and its scat- 
ter, we assume that the distribution of 7tot in the parent 
population of massive, relaxed galaxy clus ters is Gaus- 
sian. F ollowing the formalism described by |Bolton et al.| 
\2012\ , we infer a mean (7 tot ) = 1.16 ± 0.05+^ (er- 
rors are random and systematic, respectively, with the 
latter described below) and an intrinsic scatter ct 7 = 
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Fig. 14. — Top: Spherically-averaged profiles of the total den- 
sity, normalized by the virial radius T2oo (TablepJ and the critical 
density p c (z c i us ). Uncertainties are shown in Figure n3l The range 
over which the inner slope 7tot is defined is shown at the top of 
the panel. Arrows at the bottom indicate the three-dimensional 
half-light radii of the BCGs. Bottom: Marginalized probabil- 
ity densities for the inner slope 7tot of the total mass distribution, 
measured over (0.003 — 0.03)r200- The thick curve shows the in- 
ferred parent Gaussian distribution, as described in the text. The 
top of the panel in dica tes the effects of introducing mild orbital 
anisotropy (Section|9.3|l. 



+0.06 
-0.04- 



0.10 

centration C200 = 4.5 typical of our sample has a slope 
[3 = 1.10 (pdm oc r - ' 9 ) over the same radial interval. In- 
terestingly, the mean total density slope in our sample 
is therefore consistent with that expected of CDM-only 
halos, wit h fair ly small scatter. We return to this point 
in Section |10.2[ where we make comparisons to numeri- 
cal simulations. In Paper II, we investigate correlations 
of 7tot with other properties. 

Systematic errors may c ontri bute additional scatter in 
the measurements (Section |9.3[ ), which would imply that 

13 Grid points are logarithmically spaced and equally weighted. 



the true intrinsic scatter is yet smaller. For this reason, 
a conservative approach is to view ct 7 as an upper limit: 



For comparison, an NFW profile with con- er 7 < 0.13 (68% CL). These results are not very sensitive 



to the precise radial interval over which the slope is mea- 
sured. Taking r/r 2Q0 = 0.005 - 0.003 or 0.003 - 0.05, for 
example, only shifts (7tot) within its la uncertainty. 

Figure [15] illustrates the uniformity of the inner mass 
distribution via a different metric, demonstrating a con- 
nection between the mass on very small scales of 5 kpc 
and the mass of the cluster core within 100 kpc. In Pa- 
per II we show that stars typically compose 75% of the 
mass within 5 kpc, whereas the mass on 100 kpc scales is 
almost entirely DM. Despite this and the small range in 
these masses within our sample - each roughly a factor 
of 2 - we detect a probable correlation (Pearson correla- 
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Fig. 15. — Comparison of mass contained within the central 
5 kpc, comprising mostly stars, to that within the virial radius 
(top, showing no correlation) and within 100 kpc (bottom, show- 
ing a positive correlation), which are both dominated by DM. Error 
ellipses (lcr) indicate the formal model uncertainty. Error bars in 
the bottom panel estimate t he s ystematic uncertainties due to or- 
bital anisotropy (see Section [9.3| l and projection effects (Section^. 

tion coefficient r = 0.70, two-sided P — 0.08) Fj Using 
the same techniques as above, we estimate (-M5/100) = 
-2.24 ± 0.04±g;g| (random and systematic, as above) 



and an intrinsic scatter om = 0.09 



+0.06 
-0.04 



dex, where 



-M5/100 = logM(< 5 kpc)/M(< 100 kpc) is the ratio of 
spherically-enclosed masses. (Accounting for additional 
scatter from systematic errors would further reduce <7m-) 
The top panel of the figure shows that, in contrast, there 
is no correlation when the virial mass of the cluster is 
considered instead. As we discuss further in Section [T0| 
this can be understood if the innermost regions of the 
present BCG and cluster halo were in place at early times 
and changed little in mass subsequently, with accretion 
mostly adding mass to the outer regions to grow the BCG 
and the cluster halo. 

9.3. Systematic uncertainties 

Our dynamical model s are based on isotropic stellar or- 
bits. Prior studies (e.g., Carter et al.|1999 Gerhard et al. 



|200H C appcll ari et al.| 2007[ and see references in Sec- 
tion^. 3D have shown this to be a good approximation for 



14 The errors in the two masses are not strongly correlated, which 
is expected given that the mass estimates on 5 kpc and 100 kpc 
scales are dominated by independent kinematic and lensing mea- 
sures, respectively. 



luminous, non-rotating ellipticals in their central regions, 
with a possible tendency toward slightly radial orbits. 
We reran our analysis using a constant /3 a niso = +0.2 
(radial bias) or —0.2 (tangential bias) in the dynamical 
calculations, where /3 a niso = 1 — da I at characterizes de - 
viations from isotropy (e.g., Binney & Tremaine 1987 1 . 
The mean shifts in 7 to t were —0.07 and 0.05, respectively. 
This could be a common bias among the whole sample. 
Variable anisotropy could also introduce spurious scatter 
in the measured Ytot at the same level; in that case, the 
true physical scatter would be less. 

Since we measure kinematics well within the effective 
radii of the BCGs, taking |/3 an iso| = 0.2 corresponds to 
changes in <7i os by ~ 5 — 10% for the same mass dis- 
tribution. This is larger than the systematic erro rs of 
< 5% in the measurements themselves (Section 6.2), and 



therefore the resulting errors are less than from those 
from anisotropy. Furthermore, most of the systematic 
measurement errors are probably not correlated across 
all BCGs. Errors arising from the spherical dyna mical 
treatment are expected to be similarly small (e.g., Kro 
nawitter et al.|2000 Jiang fc Kochanek|2007 1 for nearly 



round systems like our sample. 

Spherical masses estimates derived from lensing will 
be biased if the cluster is elongated or compressed along 
the l.o.s. In Section [8] we found a mean tendency for the 
lensing mass to exceed that derived from X-ray measure- 
ments by 7% at 100 kpc. Although this is consistent 
with zero within the uncertainties in the X-ray calibra- 
tion, a 7% bias in the spherically-averged mass profile 
would shift (7tot) by only —0.03. Cluster-to-cluster vari- 
ation with \q— 1| < 0.3 (Section[8]) could introduce scatter 
of cr 7 < 0.08; accounting for this would again lower the 
inferred intrisic scatter. 

Our analysis assumes that the stellar mass in the BCGs 
follows the light measured at ~ 6000 A, i.e., that Y*v 
does not vary with radius. Color gradients indeed a ppea r 
to be small in the majority of the sample (Section |5.1[ ), 
but two BCGs (A611 and A383) show a stronger gradi- 
cnt. We take A383 as an example. Assuming that the 
near-infrared light measured in the F160W filter is a bet- 
ter proxy of the stellar mass, we applied a radial gradient 
to the model stellar mass profile based on the ratio of the 
F160W and F606W fluxes. For the same tracers, the ve- 
locity dispersions should change by < 4%, less than the 
systematic uncertainty in the measurements. This is be- 
cause the M*/Lf606W gradient becomes significant only 
at large radii where DM is dominant. We also tested 
the impact of the BCG size r cut by perturbing it by its 
10% uncertainty in A2537 and repeating our analysis, ac- 
counting for the correlated change in Ly. This led to no 
si gnificant shifts in T*v, /3, or b (see also the discussion 



Sand et al.||2l)04 > 



Considering the combination of the above uncertain- 
ties, we estimate that on a cluster-by-cluster basis there 
is an additional error of ~ 0.10 in -f t ot beyond the for- 
mal random estimates, which are comparable in mag- 
nitude. Not all of this error budget is coherent across 
the full sample: the largest source of global systematic 
bias is likely the orbital distribution. Thus, we take the 
uncertainty A7 tot = —0.07, +0.05 arising from orbital 
anisotropy as the sytematic error in the mean (7tot)- 
Finally, we wished to explore the impact on our dynam- 
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ical analysis if the BCG is not precisely at rest in the cen- 
ter of the halo. As discussed in Section|2] the X-ray cen- 
troid and the lensing center are generally quite close to 
the optical center of the BCG. However, small offsets of a 
few kpc are not excluded. In order to assess how the stel- 
lar dynamics could be affected by small-scale oscillations 
around the center of the cluster potential, we performed 
some simple numer ical simulations using the parallel TV- 
body code FVFPS ( |Londrillo et al.||2003| |Nipoti et aL 
20031. The BCG is modele d as a smgle-co mponent equi- 
librium isotropic 7 model (Dehnen 1993) with 7 = 1.5, 
scale radius a = 23.5 kpc (i.e., 3D half- mass radius 
r half ~ 40 kpc), and total mass M* = 1.5 x 10 12 M Q , 
representative of the BCGs in the observed sample. The 
galaxy was realized wi th N c± 2 x 10 5 particles following 
the same procedure as Nipoti et al. (|2003 
softening parameter e = 0.0 



Mt 



2009J , using a 
le beginning of each 



simulation the galaxy is placed at a distance r o g S0t from 
the center of a fixed gravitational potential representing 
the cluster DM halo, either at rest or in a circular or- 
bit. We explored two halo models: a steep halo (7 = 1, 
a = 352 kpc) that approximates an NFW profile with 
a* = 1.52 x 10 6 M Q kpc~ 3 and r s — a (see Equation 
[2S within the scale radius, and a shallow halo (7 = 0.5, 
a = 226 kpc) that approximates a gNFW profile with 
13 = 0.5, p s = 5.37 x 10 6 Mq kpc" 3 , and r s = a. The two 
models were chosen to nearly match at r > 100 kpc but 
differ in their inner slope. 

In the halo with the steeper NFW-like cusp, we found 
that small displacements - even up to 40 kpc - are highly 
unstable. Even when initially set on a circular orbit, the 
BCG quickly falls to the halo center within 350 Myr. 
During this time the isophotes are clearly disturbed, 
which is inconsistent with the galaxies in our sample. 
In the halo with a shallower density cusp, on the other 
hand, we found that stable oscillations with an ampli- 
tude of ~ 5 kpc are possible. During these oscillations, 
the central velocity dispersion varies from that attained 
by the same system with a stationary BCG (at the clus- 
ter center) by only a few percent. We conclude that small 
offsets between the BCG and cluster center do not pose a 
significant problem for our Jeans analysis. Furthermore, 
if the small offsets are genuine, they appear to imply a 
dark matter cusp with f} < 1. 

10. DISCUSSION 

We now consider the physical implications of our re- 
sults and compare our measured density profiles to recent 
simulations. After discussing the mass-concentration re- 
lation, we turn to evidence for a uniform total inner den- 
sity slope and compare to both DM-only simulations and 
those that include baryons. We conclude with a discus- 
sion of the processes that may be responsible for estab- 
lishing this observed uniformity. 

10.1. The mass-concentration relation 

Figure [16] shows the mass-concentration relation for 
our sample^ which was derived from NFW fits to the 
gravitational lensing data in Section [HJ Halo concentra- 
tions are generally expected to vary inversely with mass, 
due to lower background densities at the l ater epochs in 
which more massive halos assemble (e.g., Bullock et al. 
2001 Wechsler et al. 2002). The more massive clusters 
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Fig. 16. — Mass— concentration relation based on strong+weak 
lensing (contours; 68% and 95% confidence) and X-ray (points with 
marginalized lcr error bars) analyses for the full sample. Emp irical 
| |Schmidt fc Allen 2007 Ok abe et ahp OlO Ogur i et al. 1 2012b and 
theoretical relations j^rada et al.||^0l^| |z*riao et al. 2UU9] |I5uffy| 
|et al.|2008| with shading indicating the lrr scatter J are shown tor 
comparison, standardized to the same overdensity. Dashed con- 
tours for A383 show the effect of adopting a prolate halo, which 
brings the lensing and X-ray measures into agreement (Section|8.1[l. 



(M200 1O 15 M ) in our sample have concentrations 
in line with the predictions of most numerical simula- 
tions, although we note that current simulations do not 
have the necessary volume to provide good statis tics in 

pOT2l, who 



pr 

this regime. The exception is |Prada et al. 
surprisingly have reported an increasing concentration 
at higher masses^] However, as we move toward lower 
mass the concentrations become significantly higher than 
CDM simulations. MS2137, in particular, has a quite 
high concentration inferred from both lensing a nd X-ray 
measuremen ts , which has long been recognized (Gavazzi 
et al. 2003). The effect is to produce a significantly 
steeper slope in the mass-concentration relation com- 
pared to CDM simulations. Interestingly, the steep slope 
defined by our sample a grees w ell with measuremen ts by 
Schmidt h Allen|(|2007l X -raxL jOkabe et al.| poTo] weak 
lensing), and (Jgun et al. ( 2012[ strong and weak lens- 
ing). 

Lensing-based concentrations could potentially be bi- 
ased high for two reasons. Firstly, projection effects can 
cause an upward bias if the major axis of the cluster is 
near the line of sight. This is an unlikely to be a major 
effect in our sample given the overall good agreement 
between the lensing- and X-ray-based measures (Sec- 
tion [8]). Secondly, more concentrated clusters - partic- 
ularly among the lower-mass systems - are more likely 
to reach the critical surface density for forming multiple 
images, which is a necessary condition for entering our 
sample. Simulations of this potential bias suggest that 

15 Figure |l6| represents an extrapolation to higher masses than 
are contained in the simulations on which their model is calibrated. 
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the population of cluster lenses may have ~ 10 — 35% 



higher concentrations on average (Hennawi et al. 2007 



Fedeli et aL"1|2007| |Meneghetti et al.| |2010b| , but that 



highest concentrations seen in MS2137 and other c lus 



tions suggests that the net effect of adding baryons to 
the cluster core should mainly be to displace DM such 
that the total density does not change much, at least at 
radii 



ters (e.g., Broadhurst et al.|2008 Zitrin et al.|2011a | are 
still not explained. Baryon cooling is also generally ex- 
pect edj»jncrease_clust^ by only < 20% 
(e.g., puffy et al |2010l|Mead et al |20To| . Larger samples 



17 



> 5 — 10 kpc. In the right panel of Figure 
compare our results to several hydrodynamical simula- 
tions that include ba ryons , cooling, and star formation 



(Gnedin et al. 2004 G04; Somm er-Larsen & Limousin 
,20101 SLIP; |Martizzi et al.||2012[ M12). In general such 
of lenses (e.g., Postman et al. 2012a fand close compar- simulations suffe r from a well-known "oy e rcooling" prob 



isons with X-ray observations (e.g. 

should allow the significance of these trends to be verified 
or otherwise in the near future. 

10.2. The uniformity of the total inner mass 
distribution and comparison to simulations 

While the mass-concentration r elation has a signif icant 



orandi et al.|2010[) lem (e.g., G04; |McCarthy et al.||2010| |Teyssier et "al 



2011| ), in which the inability to suppress late cooling 



intrinsic scatter of er C200 ~ 25% ( Neto et al.||2007| and 
higher when measured only in projection], the shape of 
the density pro file is expected to be more uniform (e.g., 
Gao et al.|2012 ). Thus, if the goal is a precise measure of 
the shape of the mass profile, i.e., its logarithmic slope, 
sample size is secondary to the density and radial extent 
of observational constraints. The combination of data 
sets we have presented provides precise constraints over 
the full range of radial scales, and thus forms an excellent 
basis for detailed study of the density profile, particularly 
in the inner regions. 

The slope of the total density profile at small radii is 
very similar within our sample (Section [9]). In Figure 17 
we compare the measured density profiles, scaled by the 
virial radius ^oo, to recent numerical simulations. In 
the left panel these are overlaid on sphericall y-averaged 
density profiles from the Phoenix project ( |Gao et al. 
2012 1, the highest-resolution suite of iV-body simulations 
of clusters to date. The typical convergence radius of 
2.9 /i -1 kpc is well matched to our observations, as is 
the mass range M 20 o = 0.6 - 2.4 x 10 15 h" 1 M . In the 
following comparisons we omit Phoenix-G and H, which 
are the latest clusters to assemble and remain in a un- 
relaxed state to z = 0, inconsistent with the properties 
of the observed sample. This leaves 7 simulated clusters. 
The range of density profiles they span is illustrated by 
the grey band in the left panel of Figure [TTJ 

Remarkably, the observed total density profiles closely 
parallel the Phoenix clusters that contain only dark mat- 
ter, despite the fact that the stellar mass in the BCG 
contributes noticeably within ~ 30 kpc (~ 0.02r2oo, com- 
parable to R e ). Since our parametric models for the DM 
halo have the same large-radius behavior as the NFW 
profile, similar behavior at r/r2oo ^ 0.3 is guaranteed. 
At smaller radii, however, the agreement is not trivial, 
since it results from a combination of the concentration 
and inner slope (r s and f3 or b) of the halo and the contri- 
bution of stellar mass (T*v). The high concentrations of 
MS2137 and A963 cause them to appear shifted leftward 
of the Phoenix clusters in this plot, but even in these 
cases the slope of the density profile is similar. The bot- 
tom of the panel indicates the radial intervals over when 
the models are constrained by the various data setsP°| 
The similarity of the observations to DM-only siiriuTa- 

16 Minor "wiggles" appearing r/r'200 ~ 10 ~ 2 should not be over- 
interpreted given that we lack constraints there and the mass model 
parametrization is simple. 



leads to the formation of far too much stellar mass at the 
cluster center. The build-up of baryons then leads to a 
significant contraction of the halo, increasing the central 
DM density. Thus in the G04 and M12 "AGN off" simu- 
lations, the central densities are much too high; even the 
density of DM alone (not plotted) exceeds the measured 
total. SL10 estimated the effects of overcooling through 
an ad hoc simulation in which late-forming stars were 
slowly removed following z — 2 (their "Rz2" runs). This 
ameliorates the problem but still leaves a steeper total 
density slope than observed, with j iot — 1.5. M12 per- 
formed a very high-resolution simulation that included 
feedback from an AGN. Interestingly, the AGN is effec- 
tive not only at quenching late star formation but also at 
ejecting DM from the center. The latter is accomplished 
through several mechanisms that M12 discuss, including 
rapid fluctuations in the potential due to expulsion of gas 
during AGN outbursts] 1 | and adiabatic expansion from 
the slower removal of gas in the quiescent AGN mode. 
The process is rather too effective, as it results in a 10 kpc 
stellar co re that is much bigger than the largest observed 



example (Postman et al. 2012b ). Still, this work points to 
a possibly important role for the supermassive black hole. 
We note that, except for SL10, the simulated clusters dis- 
cussed here are less massive (M 2 oo — 1 — 4 x 10 14 M Q ) 
than the observed sample. 

Currently, it appears that iV-body simulations based 
only on gravity produce more realistic total density pro- 
files than any hydrodynamical simulation. The similarity 
of the total density slope is quantified further in the left 
panel of Figure |18[ which compares the 7 to t measure- 
ments of individual clusters, along with their inferred 
parent distribution (dashed; see Sectional), to the inner 
slopes of the Phoenix clusters defined in the same man- 
ner. The mean slope 7t t = 1.13 ± 0.02 in the Phoenix 
simulations agrees well with measured total density 
slope: (7tot) = 1.16 ±0.05 (random) i ; 07 (systematic). 
The intrinsic scatter in 7 to t is possibly larger in the obser- 
vations, but this difference cannot be asserted with much 
certaint y du e to the systematic limitations discussed in 
Section |9.3| We stress again that these are DM-only 
simulations and that their relevance to the total mass in 
real clusters over this range of radii is surprising. Other 



lensing (e.g., Umetsu et al.| 


2011 Morandi et al. 


2011 


Zitrin et al. 201 lb;; Morandi & Limousin 2012) and X.- 


ray (Lewis et al.| 2003 


Zappacosta et al.||2006|) studies 



close to an NFW profile; we review these further in Pa- 
per II. 

17 This is similar to the mechanism suspected of pro ducing cores 
in dwarf galaxie s, fueled in that case by supernovae ( jPontzen fc| 
|Governato|2012) . 
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Fig. 17. — Left: Total scal ed density profile s for the full sample (colored lines) are compared to simulated clusters — containing only DM 
- from the Phoenix project jGao et al. 12012k. The dashed line shows the mean of the 7 simulated Phoenix clusters, while the grey band 
outlines the envelope they denne. Observed profiles are plotted down to 3 kpc. The radial range spanned by each data set is indicated at 
the bottom, and the interval over which -ytot is defined is shown at the top of the panel. Note that the density has been multiplied by r 2 
to reduce the dynamic range; thus, an isothermal slope p oc r -2 is horizontal. Right: The observed total density profiles (thin lines, as 
in left panel) are compared to several hydrody namical simulations that include ba ryons, cooling, and feedback. The |Gnedin et ah| ( |2004] l 
results are taken from their Figure 2, while the Sommcr-Larsen & Limousin (2010) curves refer to their Coma "Rz2" simulation. 
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Fig. 18. — Left: The total inner density slope -ytot is shown for each cluster (colored lines, following the legend in Figure \ Vf\ and 
compared to the slopes derived for the Phoenix clusters (DM-only simulations, black histogram). For the individual observed clusters, 
inner arrows and t he fu ll length of each line indicate the 68% and 95% confidence intervals, respectively. The inferred Gaussian parent 
population (Section |9.2| is shown by the dashed curve, with the grey band showing the 68% confidence region. The effects of mild orbital 
anisotropy on the observational results are illustrated at the top of the panel. Right: Velocity dispersion data, as in Figure [TT] normalized 
for each cluster by (Ti n B at R = 3 kpc by i nterpolating the measurements. Diamonds indicate measurements in NGC 6166, the BCG of 
A2199 (z = 0.03), from |Kelson et al.||2002| |. The solid line shown the mean of the mass models for the full sample, normalized in the same 
way, while the dashed line shows the same if observational effects (slit width and seeing) are excluded. 

The uniformity of the total inner mass distribution 
is futher supported by the striking homogeneity in the 
shapes of the velocity dispersion profiles. The right panel 
of Figure 18 plots these profiles normalized to the ob- 
served dispersion at R = 3 kpc. With this single scaling, 
the velocity dispersion profiles for all 7 clusters are mu- 
tually consistent within their uncertainties. In this figure 
we also co mpare to the BCG o f the nearby cluster A2199 



(note that we cannot resolve these scales due to the slit 
width and seeing) pj Although r ising a profiles in cD 
galaxies have been observed since Dressier (1979), t here 
has been some uncertainty (e.g. , l^'isher et a . 



binski||1998| |Carter et aL||1999| |Hau et al.||2UU4p about 



5 Du- 



the frequency of this phenomenon, which is the expected 



The 



[z = 0.03; Kelson et al. 2002). Where the data overlap radii > 2 5 kpc, beyon. 



Kelson et al 



they are consistent with our sample, except at < 1 kpc 
where the black hole is probably dynamically significant 



1 |2002| ) data are higher than our models at 
a the outer limits of our velocity dispersion 



measures but within the range of ~ 30 — 100 kpc where strong 
lensing constrains the mass. This could indicate that the dynamical 
structure becomes less homogeneous near R e . 
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response of stars to the central cluster potential. Our 
observations suggests that it is ubiquitous in BCGs that 
are well-aligned with the centers of relaxed clusters. 

10.3. A physical picture 

In ACDM-based models the formation of BCGs is ex- 
pected to occur relatively late and be dominated by dry 
(dissipationless) merging (e.g., De Lucia k Blaizot|2007[ ). 
Since NFW profiles are simply the product of collisionless 
collapse and merging, one interpretation of our findings 
is that the processes that set the inner density profile in 
clusters are primarily gravitational. Understanding how 
the total density profile rem ains similar to that expe cted 
of CDM alone is not trivial. Loeb k Peebles ( |2003 ) and 
|Gao et ah] (2004) hypothesized that repeated merging 
might drive the total collisionless (stars and DM) den- 
sity toward an NFW-like profile, noting that this could 
solve two puzzles: the lack of very high-dispersion galax- 
ies with a e > 400 km s , and our own earlier observa- 
tions that the DM density profile is shallower than the 
NFW form in cluster cores ( [Sand et al.| 2002[ |2004[). As 



a starting point, based on both analytic arguments and 
CDM simulations, they showed that the mass in the cen- 
tral regions of present-day massive clusters changes very 
little at z < 6, but the identity of these particles changes 
considerably. The particles arriving in mergers displace 
those already present, maintaining the central density. 

In reality we expect the progenitors of the BCG and 
the infalling galaxies to have been compresse d due to 
baryon loading (e.g., 



references in Section 



1986 and see 



Blumenthal et al. 

Indeed, the total density pro- 
files we derive do appear to steepen slightly in the inner 
~ 5 — 10 kpc. In Paper II, we show that this is the 
regime in which the stellar density exceeds that of DM. 
Although it is difficult to pinpoint this scale precisely, it 
is certainly well within the present effective radius (me- 
dian (R e ) — 34 kpc), where stars begin to contribute 
non-negligibly to the total density. Furthermore, this 
scale bears a striking similarity to sizes of the most mas- 
sive galaxies at high redshift, which many observations 
now indicate are quite compact (e.g., Trujillo et al.|2006| 
van Dokkum et al.||2008| | Williams et al.||2010| |JNewman| 



et al.||2012p . For example, a simple extrapolat i on of the 
observed st ellar mass-size relation at 
et al.|2012[ ) would yield a size of R 



z 
2- 



i 2.5 (Newman 
6 kpc for likely 

progenitorsFj Indir ect support for this comes from our 



observation (Figure 15) that the mass contained in the 
inner 5 kpc - mostly stars, but only a small fraction of 
the stars in the present BCG - is well-correlated with 
the mass of the cluster core within 100 kpc, which is 
also expected to be in place by z s» 3 and change rela- 
tively little subsequently (G04, Figure 1). Interestingly, 
color gradients in BCGs (when present) occur mostly at 
R > 10 kpc, while the innermost regio ns are more homo 



geneous in both color and luminosity ( Postman k Lauer 
1995] |Bildfell et al.|[2008 |. 



This suggests a picture in which stars in the innermost 
~ 5 — 10 kpc are formed early within the BCG progeni- 
tor, where dissipation establishes a steep stellar density 

19 We caution that low-z BCGs do not lie on a simple extrapo- 
lation of the trend defined by lower-mass ellipticals (e.g., [von der| 
|Linden et al.|p007} , but the situation for the very most massive 
galaxies at nign-z is uncertain due to the small volumes surveyed. 



profile, while subsequent dry merging of infalling satel- 
lites mostly adds stars to the outer regions of the BCG in 
a manner that nearly maintains the total density. This 
requires that the stars and DM arriving in mergers dis- 
place a roughly equal amount of existing DM. Simula- 
tions indeed indicate that stars arriving in minor (low 
mass ratio) mergers, which dominate the accretion his- 
tory of very massive galaxies , are primarily added to the 



outskirts o f the BCG (e.g., Naab et al. 2009 Laporte 



et al.|2012 ). However, the precise effect of these mergers 



on the DM already in place is not clear. Using dissipa- 
tionless A^-body simulations, several authors have shown 
that dynamical friction of the infalling satellites on the 
halo can "heat" the cusp and reduce the central D M den- 
sity (e.g., |El-Zant et alj2004] |Nipoti et al.|2004| and see 
references in Section 111, and that this can more than 
overcome the deeper central potential that results from 
the central build-up of baryons. Thi s process is sensitive 



to the nature of the satellites ( e.g. , [Ma k Boylan-Kolchin 
2004[|Jardel k Sellwood|2009[ ), and a fully realistic treat 



ment has been lacking in simulations to date. Satellites 
will bring in their own DM, counteracting this central de- 
pletion. Tightly-bound galaxies are more effective, since 
they are more resista nt to stripping and so survive longer. 
Laporte et al. ( |2012 ) point out that the compact stellar 
configuration observed in hig h-z massive ga l axies is sig- 
nificant in this context, while Martizzi et al. (2012 1 show 
that infalling central black holes are also important in 
their simulations. 

In this scheme there is little room for additional con- 
traction or steepening of the mass profile, and the rel- 
evant physics is primarily dissipationless. In contrast, 
a major focus in the theoretical literatu re has been 
the "adiaba tic contraction" (AC) formalism (Blu menthal 
et al.||1986|) and its modified versions (e.g., |Gnedin et al. 

which predict the steepening of the inner 



DM profile resulting from the slow cooling and central 
condensation of baryons. In contrast to the scheme de- 
scribed above, in which the orbital energy lost by in- 
falling stellar clumps is transferred to the halo, the en- 
ergy lost by baryons is radiated and lost from the system; 
thus, the AC model em phasizes the role of dissipa tion in 



forming the BCG (e.g., |Lackner k Ostriker|j20l0| ) . This 
model takes no account of mergers at all. The scenario 
advanced above argues that while one may be able to fit 
the results of simulations by introducing additional para- 
maters to the model, this does not necessarily mean that 
the most releva nt underlying phys ics are being accurately 
described (e.g., Duffy et al. 2010). The relevance of AC 
in describing the results of cosmological simulations with 
gas probably reflects the known overcooling problem that 
cause the effects of dis sipation to be overstated. As dis- 
cussed in Section fl0.2[ the inclusion of AGN appears to 
solve most of this problem and may additionally drive 
DM out of the center. 

An alternative possibility is that the central DM den- 
sity is reduced relative to CDM simulations due to the 
nature of the DM particle. For example, a small self- 
interaction cross-sec tion could produce moderately shal- 
lower density cusps (Spergel k Steinhardt 2000| jYoshida| 
et al ||2000[ |Rocha et~al.||2012[ |Peter et al.||2012p , which 
could then leave room for baryons to boost the total den- 
sity to an NFW-like profile. We discuss the likelihood of 
these scenarios further in Paper II, where we measure the 
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inner DM density slope. 

The results we present for BCGs are quite different 
from observations of massive field ellipticals, which uni- 



radii that is nearly isothermal (p tnt oc r 2 ; 


Treu et al. 


2006 


Koopmans et al. 


2009 


Auger et al. 


J009). The 



verting their baryons i nto stars (e.g., Guo et al. 



2010 



Leauthaud et al.||2012[ ). As a consequence, BCGs are 
much more DM-dommated, so it is not surprising that 
dissipation would play a lesser role in their formation. 
The greater importance of minor mergers in t heir assem- 
bly may also be important ( Naab et al.|2009 ). Thus, our 
results do not directly conflict with studies claiming that 
adiabatic con traction may be significant in lower-mass el - 
lipticals (e.g., Dutton et al. 2011 Sonnenfeld et al.|2012 ). 
They do show it the currently discussed prescriptions tor 
halo contraction are valid, they have a limited range of 
applicability that likely varies with star- format ion effi- 
ciency and assembly history. Although the isothermal 
and collisionless (NFW) limits have often been discussed 
as special configurations in the literature, we should be 
able to see an intermediate case in galaxy groups. In- 
deed, this m ay have already be observe d by |Spiniello| 
et al.| ( |201l| and |Sonnenfeld et al.||2012[ ). 

Finally, we emphasize that these results are fully con- 
sistent with our previous claims that the DM density 
slope in cluster cores is shallower than an NFW profil e 



( |Sand et al |2002 |2004M2008[ |Newman et al.|2009| 2011 
given that the subtraction of stellar mass from an NT 
like profile must yield one with a shallower inner slope. 
They also explain previously reported discrepancies be- 
tween our results and independent analyses in the same 
clusters that are confined to radii where the stellar mass 
is negligible (r > R 



e.g. 



Morandi & Limousin 2012) 



Quantifying the DM profile requires techniques tor accu- 
rately separating the stellar and dark mass. We describe 
these in Paper II. 

11. SUMMARY 

We presented observations of a sample of 7 massive, 
relaxed galaxy clusters at (z) = 0.25. The data comprise 
25 multiply-imaged sources (21 with spectroscopic red- 
shifts, of which 7 are original to this work) that present 80 
images, weak lensing constraints from multi-color imag- 
ing, and spatially-resolved stellar kinematics within the 
BCGs. Taken together, these data from the HST, Sub- 
aru, and Keck telescopes extend from ~ 0.002r 2 oo to be- 
yond the virial radius, providing detailed constraints on 
the global mass distribution. 

1. We find that the clusters in our sample are not 
strongly elongated along the line of sight (except 
A383) and that their intracluster media are close 
to equilibrium, based on the agreement between 
mass profiles derived from independent lensing and 
X-ray observations (Section [8]). 

2. Physically motivated and simply parametrized 
models provide good fits to the full range of data. 
The inner logarithmic slope of the total density 
profile measured over (0.003 — 0.03)r 2 oo (on av- 
erage, 5 — 55 kpc) is remarkably uniform, with 
( 7tot ) = 1.16 ± 0.05 (random) torn (systematic) 



and an intrinsic scatter 
at 68% confidence). 



10+ 006 

u - lu -0.04 



(fJ 7 < 0.13 



3. Supporting the uniformity of the inner mass dis- 
tribution, the extended stellar velocity dispersion 
profiles show a clear rise with radius and display a 
very homogeneous shape after a single scaling. 

4. The shape of the total density profile is in surpris- 
ingly good agreement with high-resolution simula- 
tions containing only CDM, despite a significant 
contribution of stellar mass within the BCG over 
the scales we measure. Hydrodynamical simula- 
tions including baryons, cooling, and feedback cur- 
rently provide much poorer descriptions. 

5. Our findings support a picture in which an early 
dissipative phase associated with star formation in 
the BCG progenitor establishes a steeper total den- 
sity profile in the inner w 5 — 10 kpc - comparable 
to the size of very massive, red galaxies at z > 2 
- while subsequent accretion of stars (still within 
the present effective radius) mostly replaces DM so 
that the total density is nearly maintained. 

6. These results are fully consistent with our earlier 
claims that the slope of the dark matter profile is 
shallower than NF W- type cusp in the innermost 



regions of clusters (|Sand et al 
2009 



Newman et al 
separating the dark an 



2002 

20111). In Paper 



2004 2008 



we turn to 
aryonic mass profiles. 
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Positions of multiple images 
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APPENDIX 

Table 10 lists the angular positions and redshifts of the multiply-imaged sources used in our strong lensing analysis. 



